Overview


Dataset statistics

                                Dataset A   Dataset B
Number of variables             74          74
Number of observations          4365        1339
Missing cells                   91121       31123
Missing cells (%)               28.2%       31.4%
Total size in memory            2.6 MiB     816.9 KiB
Average record size in memory   630.3 B     624.7 B
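
The missing-cell percentages can be checked by hand. The sketch below assumes that "Missing cells (%)" is the missing-cell count divided by observations times variables; the function name is illustrative and not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells, n_rows, n_cols):
    """Share of missing cells as a percentage of all cells (rows x columns)."""
    return 100.0 * missing_cells / (n_rows * n_cols)


# Figures copied from the Dataset statistics table.
pct_a = missing_cell_pct(91121, 4365, 74)
pct_b = missing_cell_pct(31123, 1339, 74)
print(f"Dataset A: {pct_a:.1f}%  Dataset B: {pct_b:.1f}%")
```

Under that assumption, both results agree with the percentages reported in the table.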

Variable types

              Dataset A   Dataset B
Numeric       41          41
Text          28          28
Unsupported   5           5

Alerts

Missing and Zeros rows give the affected count and its share of observations. "not present" means the alert was not raised for that dataset. Unsupported means the variable has an unsupported type; check if it needs cleaning or further analysis.

Variable           Alert         Dataset A          Dataset B
AgeDecade          Missing       137 (3.1%)         107 (8.0%)
AgeMonths          Missing       2358 (54.0%)       666 (49.7%)
Race3              Missing       2077 (47.6%)       722 (53.9%)
HHIncome           Missing       301 (6.9%)         173 (12.9%)
HHIncomeMid        Missing       301 (6.9%)         173 (12.9%)
Poverty            Missing       259 (5.9%)         154 (11.5%)
Length             Missing       4365 (100.0%)      1339 (100.0%)
HeadCirc           Missing       4365 (100.0%)      1339 (100.0%)
BMICatUnder20yrs   Missing       4365 (100.0%)      1339 (100.0%)
BMI_WHO            Missing       57 (1.3%)          18 (1.3%)
Pulse              Missing       147 (3.4%)         48 (3.6%)
BPSysAve           Missing       154 (3.5%)         50 (3.7%)
BPDiaAve           Missing       154 (3.5%)         50 (3.7%)
BPSys1             Missing       302 (6.9%)         96 (7.2%)
BPDia1             Missing       302 (6.9%)         96 (7.2%)
BPSys2             Missing       244 (5.6%)         85 (6.3%)
BPDia2             Missing       244 (5.6%)         85 (6.3%)
BPSys3             Missing       223 (5.1%)         79 (5.9%)
BPDia3             Missing       223 (5.1%)         79 (5.9%)
Testosterone       Missing       2208 (50.6%)       795 (59.4%)
DirectChol         Missing       209 (4.8%)         99 (7.4%)
TotChol            Missing       209 (4.8%)         99 (7.4%)
UrineFlow1         Missing       241 (5.5%)         107 (8.0%)
UrineVol2          Missing       3634 (83.3%)       1192 (89.0%)
UrineFlow2         Missing       3636 (83.3%)       1192 (89.0%)
DiabetesAge        Missing       4073 (93.3%)       1163 (86.9%)
HealthGen          Missing       420 (9.6%)         190 (14.2%)
DaysPhysHlthBad    Missing       421 (9.6%)         194 (14.5%)
DaysMentHlthBad    Missing       421 (9.6%)         192 (14.3%)
LittleInterest     Missing       3541 (81.1%)       1010 (75.4%)
Depressed          Missing       3637 (83.3%)       1013 (75.7%)
nPregnancies       Missing       2812 (64.4%)       887 (66.2%)
nBabies            Missing       2945 (67.5%)       909 (67.9%)
Age1stBaby         Missing       3295 (75.5%)       972 (72.6%)
PhysActiveDays     Missing       1951 (44.7%)       821 (61.3%)
TVHrsDay           Missing       2078 (47.6%)       723 (54.0%)
CompHrsDay         Missing       2077 (47.6%)       722 (53.9%)
TVHrsDayChild      Missing       4365 (100.0%)      1339 (100.0%)
CompHrsDayChild    Missing       4365 (100.0%)      1339 (100.0%)
Alcohol12PlusYr    Missing       423 (9.7%)         199 (14.9%)
AlcoholDay         Missing       1208 (27.7%)       626 (46.8%)
AlcoholYear        Missing       733 (16.8%)        355 (26.5%)
SmokeNow           Missing       2653 (60.8%)       596 (44.5%)
SmokeAge           Missing       2744 (62.9%)       613 (45.8%)
Marijuana          Missing       1374 (31.5%)       602 (45.0%)
AgeFirstMarij      Missing       2568 (58.8%)       956 (71.4%)
RegularMarij       Missing       1374 (31.5%)       602 (45.0%)
AgeRegMarij        Missing       3641 (83.4%)       1092 (81.6%)
HardDrugs          Missing       835 (19.1%)        484 (36.1%)
SexEver            Missing       837 (19.2%)        480 (35.8%)
SexAge             Missing       959 (22.0%)        520 (38.8%)
SexNumPartnLife    Missing       852 (19.5%)        501 (37.4%)
SexNumPartYear     Missing       1382 (31.7%)       607 (45.3%)
SameSex            Missing       835 (19.1%)        481 (35.9%)
SexOrientation     Missing       1406 (32.2%)       646 (48.2%)
Length             Unsupported   unsupported type   unsupported type
HeadCirc           Unsupported   unsupported type   unsupported type
BMICatUnder20yrs   Unsupported   unsupported type   unsupported type
TVHrsDayChild      Unsupported   unsupported type   unsupported type
CompHrsDayChild    Unsupported   unsupported type   unsupported type
DaysPhysHlthBad    Zeros         2614 (59.9%)       704 (52.6%)
DaysMentHlthBad    Zeros         2269 (52.0%)       663 (49.5%)
AlcoholYear        Zeros         473 (10.8%)        270 (20.2%)
SexNumPartnLife    Zeros         139 (3.2%)         45 (3.4%)
SexNumPartYear     Zeros         518 (11.9%)        133 (9.9%)
HomeRooms          Missing       not present        16 (1.2%)
HomeOwn            Missing       not present        14 (1.0%)
BMI                Missing       not present        14 (1.0%)
UrineVol1          Missing       not present        32 (2.4%)
Poverty            Zeros         not present        14 (1.0%)
BPDia3             Zeros         not present        14 (1.0%)

Reproduction

                         Dataset A                    Dataset B
Analysis started         2025-08-31 00:31:04.591042   2025-08-31 00:31:04.933906
Analysis finished        2025-08-31 00:31:04.926789   2025-08-31 00:31:05.222961
Duration                 0.34 seconds                 0.29 seconds
Software version         ydata-profiling v4.16.1      ydata-profiling v4.16.1
Download configuration   config.json                  config.json
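
The reported durations follow from the start and finish timestamps. A minimal sketch using only the standard library; the timestamp format string is an assumption read off the values in this section.

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started, finished):
    """Elapsed time between two timestamp strings, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


dur_a = duration_seconds("2025-08-31 00:31:04.591042", "2025-08-31 00:31:04.926789")
dur_b = duration_seconds("2025-08-31 00:31:04.933906", "2025-08-31 00:31:05.222961")
print(f"Dataset A: {dur_a:.2f} s  Dataset B: {dur_b:.2f} s")
```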

Variables

ID
Real number (ℝ)

               Dataset A     Dataset B
Distinct       2669          983
Distinct (%)   61.1%         73.4%
Missing        0             0
Missing (%)    0.0%          0.0%
Infinite       0             0
Infinite (%)   0.0%          0.0%
Mean           62238.74296   61474.35474
Minimum        51630         51657
Maximum        71915         71909
Zeros          0             0
Zeros (%)      0.0%          0.0%
Negative       0             0
Negative (%)   0.0%          0.0%
Memory size    197.2 KiB     53.2 KiB

Quantile statistics

                            Dataset A   Dataset B
Minimum                     51630       51657
5-th percentile             52618.4     52254.3
Q1                          57305       56455.5
median                      62575       61444
Q3                          67392       66614.5
95-th percentile            70984       70896.2
Maximum                     71915       71909
Range                       20285       20252
Interquartile range (IQR)   10087       10159
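
Range and IQR are derived from the other rows: range is maximum minus minimum, and IQR is Q3 minus Q1. A small sketch with the ID quantiles copied from the table; the dictionary keys are illustrative names.

```python
def spread(q):
    """Return (range, interquartile range) from a dict of quantiles."""
    return q["max"] - q["min"], q["q3"] - q["q1"]


# ID quantiles copied from the table above.
id_a = {"min": 51630, "q1": 57305, "q3": 67392, "max": 71915}
id_b = {"min": 51657, "q1": 56455.5, "q3": 66614.5, "max": 71909}

range_a, iqr_a = spread(id_a)
range_b, iqr_b = spread(id_b)
print(range_a, iqr_a, range_b, iqr_b)
```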

Descriptive statistics

                                  Dataset A       Dataset B
Standard deviation                5863.839644     5982.731972
Coefficient of variation (CV)     0.09421526474   0.09732077705
Kurtosis                          -1.179869659    -1.189225533
Mean                              62238.74296     61474.35474
Median Absolute Deviation (MAD)   5019            5070
Skewness                          -0.1200069753   0.05963696675
Sum                               271672113       82314161
Variance                          34384615.38     35793081.85
Monotonicity                      Not monotonic   Not monotonic
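
Several of these rows are functions of one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of non-missing rows. A quick consistency check on the ID figures; the variable names are illustrative.

```python
# ID statistics copied from the tables above (ID has no missing values).
n_a, mean_a, std_a = 4365, 62238.74296, 5863.839644
n_b, mean_b = 1339, 61474.35474

cv_a = std_a / mean_a      # coefficient of variation
var_a = std_a ** 2         # variance
sum_a = mean_a * n_a       # sum of all values
sum_b = mean_b * n_b

print(f"CV={cv_a:.10f}  variance={var_a:.2f}  sum A={sum_a:.0f}  sum B={sum_b:.0f}")
```

Small differences in the last printed digit are expected because the inputs are already rounded in the report.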
Histogram with fixed size bins (bins=50)
Dataset A
Value                 Count   Frequency (%)
62927                 7       0.2%
69626                 7       0.2%
63297                 7       0.2%
63390                 6       0.1%
63744                 6       0.1%
68035                 6       0.1%
60566                 6       0.1%
64675                 6       0.1%
70653                 6       0.1%
63330                 6       0.1%
Other values (2659)   4302    98.6%

Dataset B
Value                 Count   Frequency (%)
63163                 6       0.4%
64530                 6       0.4%
53369                 5       0.4%
63942                 5       0.4%
69637                 5       0.4%
70581                 4       0.3%
60154                 4       0.3%
69365                 4       0.3%
61836                 4       0.3%
62418                 4       0.3%
Other values (973)    1292    96.5%
Value   Count   Frequency (%)
51630   1       < 0.1%
51647   3       0.1%
51654   1       < 0.1%
51656   1       < 0.1%
51667   1       < 0.1%

Value   Count   Frequency (%)
51657   1       0.1%
51701   1       0.1%
51702   2       0.1%
51707   1       0.1%
51711   2       0.1%

Value   Count   Frequency (%)
51657   1       < 0.1%
51701   1       < 0.1%
51702   2       < 0.1%
51707   1       < 0.1%
51711   2       < 0.1%

Value   Count   Frequency (%)
51630   1       0.1%
51647   3       0.2%
51654   1       0.1%
51656   1       0.1%
51667   1       0.1%

Gender
Text

               Dataset A   Dataset B
Distinct       2           2
Distinct (%)   < 0.1%      0.1%
Missing        0           0
Missing (%)    0.0%        0.0%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      6             6
Median length   6             4
Mean length     5.052004582   4.912621359
Min length      4             4

Characters and Unicode

                      Dataset A   Dataset B
Total characters      22052       6578
Distinct characters   5           5
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
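
The length and character totals for Gender can be rebuilt from the value counts listed later in this section (female 2296 and male 2069 in Dataset A; male 728 and female 611 in Dataset B). The sketch below is one way to do that with the standard library; `text_stats` is a hypothetical helper, not a ydata-profiling function.

```python
from collections import Counter


def text_stats(value_counts):
    """Return (total characters, mean length, per-character counts)."""
    total_rows = sum(value_counts.values())
    chars = Counter()
    for value, count in value_counts.items():
        for ch in value:
            chars[ch] += count  # each row contributes one occurrence per character
    total_chars = sum(chars.values())
    return total_chars, total_chars / total_rows, chars


gender_a = {"female": 2296, "male": 2069}
gender_b = {"male": 728, "female": 611}

total_a, mean_len_a, chars_a = text_stats(gender_a)
total_b, mean_len_b, chars_b = text_stats(gender_b)
print(total_a, round(mean_len_a, 9), chars_a.most_common(1))
print(total_b, round(mean_len_b, 9), chars_b.most_common(1))
```

The same helper reproduces the count for "e", the most frequent character in both datasets.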

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A   Dataset B
1st row   female      male
2nd row   male        female
3rd row   female      male
4th row   male        female
5th row   male        male

Dataset A
Value    Count   Frequency (%)
female   2296    52.6%
male     2069    47.4%

Dataset B
Value    Count   Frequency (%)
male     728     54.4%
female   611     45.6%

Most occurring characters

Dataset A
Value   Count   Frequency (%)
e       6661    30.2%
m       4365    19.8%
a       4365    19.8%
l       4365    19.8%
f       2296    10.4%

Dataset B
Value   Count   Frequency (%)
e       1950    29.6%
m       1339    20.4%
a       1339    20.4%
l       1339    20.4%
f       611     9.3%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   22052 (100.0%)   6578 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

Age
Real number (ℝ)

               Dataset A     Dataset B
Distinct       61            61
Distinct (%)   1.4%          4.6%
Missing        0             0
Missing (%)    0.0%          0.0%
Infinite       0             0
Infinite (%)   0.0%          0.0%
Mean           46.11065292   50.21135176
Minimum        20            20
Maximum        80            80
Zeros          0             0
Zeros (%)      0.0%          0.0%
Negative       0             0
Negative (%)   0.0%          0.0%
Memory size    197.2 KiB     53.2 KiB

Quantile statistics

                            Dataset A   Dataset B
Minimum                     20          20
5-th percentile             22          23
Q1                          33          36
median                      45          49
Q3                          58          65
95-th percentile            75          80
Maximum                     80          80
Range                       60          60
Interquartile range (IQR)   25          29

Descriptive statistics

                                  Dataset A       Dataset B
Standard deviation                16.20067466     17.90815272
Coefficient of variation (CV)     0.3513434236    0.3566554592
Kurtosis                          -0.8495956025   -1.10651191
Mean                              46.11065292     50.21135176
Median Absolute Deviation (MAD)   13              14
Skewness                          0.2916869443    0.1344584573
Sum                               201273          67233
Variance                          262.4618594     320.7019338
Monotonicity                      Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
Dataset A
Value               Count   Frequency (%)
80                  137     3.1%
39                  122     2.8%
50                  107     2.5%
38                  103     2.4%
43                  102     2.3%
36                  101     2.3%
56                  100     2.3%
54                  99      2.3%
24                  98      2.2%
33                  95      2.2%
Other values (51)   3301    75.6%

Dataset B
Value               Count   Frequency (%)
80                  107     8.0%
40                  33      2.5%
30                  33      2.5%
42                  31      2.3%
48                  31      2.3%
39                  31      2.3%
46                  30      2.2%
63                  30      2.2%
51                  30      2.2%
45                  28      2.1%
Other values (51)   955     71.3%
Value   Count   Frequency (%)
20      81      1.9%
21      72      1.6%
22      74      1.7%
23      73      1.7%
24      98      2.2%

Value   Count   Frequency (%)
20      19      1.4%
21      22      1.6%
22      17      1.3%
23      15      1.1%
24      17      1.3%

Value   Count   Frequency (%)
20      19      0.4%
21      22      0.5%
22      17      0.4%
23      15      0.3%
24      17      0.4%

Value   Count   Frequency (%)
20      81      6.0%
21      72      5.4%
22      74      5.5%
23      73      5.5%
24      98      7.3%
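
The four small Age tables above repeat the same counts with different percentages. The arithmetic suggests, though the report does not say so, that the last two tables divide each dataset's counts by the other dataset's row total. The sketch below checks that reading against the printed figures; `pct` is an illustrative helper.

```python
N_A, N_B = 4365, 1339  # observations in Dataset A and Dataset B


def pct(count, n_rows):
    """Format a count as a percentage of n_rows with one decimal place."""
    return f"{100 * count / n_rows:.1f}%"


# Age 20 appears 81 times in Dataset A and 19 times in Dataset B.
own_a, own_b = pct(81, N_A), pct(19, N_B)      # first two tables
cross_b, cross_a = pct(19, N_A), pct(81, N_B)  # last two tables, assumed swapped totals
print(own_a, own_b, cross_b, cross_a)
```

If that reading is right, only the percentages in the first two tables describe each dataset against its own size.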

AgeDecade
Text

               Dataset A   Dataset B
Distinct       6           6
Distinct (%)   0.1%        0.5%
Missing        137         107
Missing (%)    3.1%        8.0%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      6             6
Median length   6             6
Mean length     5.864238411   5.743506494
Min length      4             4

Characters and Unicode

                      Dataset A   Dataset B
Total characters      24794       7076
Distinct characters   11          11
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A   Dataset B
1st row   20-29       30-39
2nd row   60-69       40-49
3rd row   70+         70+
4th row   40-49       50-59
5th row   50-59       60-69

Dataset A
Value   Count   Frequency (%)
30-39   899     21.3%
40-49   836     19.8%
20-29   830     19.6%
50-59   803     19.0%
60-69   573     13.6%
70      287     6.8%

Dataset B
Value   Count   Frequency (%)
40-49   256     20.8%
50-59   230     18.7%
30-39   215     17.5%
20-29   209     17.0%
60-69   164     13.3%
70      158     12.8%

Most occurring characters

Dataset A
Value     Count   Frequency (%)
(space)   4228    17.1%
0         4228    17.1%
-         3941    15.9%
9         3941    15.9%
3         1798    7.3%
4         1672    6.7%
2         1660    6.7%
5         1606    6.5%
6         1146    4.6%
7         287     1.2%

Dataset B
Value     Count   Frequency (%)
(space)   1232    17.4%
0         1232    17.4%
-         1074    15.2%
9         1074    15.2%
4         512     7.2%
5         460     6.5%
3         430     6.1%
2         418     5.9%
6         328     4.6%
7         158     2.2%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   24794 (100.0%)   7076 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

AgeMonths
Real number (ℝ)

               Dataset A     Dataset B
Distinct       579           358
Distinct (%)   28.8%         53.2%
Missing        2358          666
Missing (%)    54.0%         49.7%
Infinite       0             0
Infinite (%)   0.0%          0.0%
Mean           544.8325859   569.8528975
Minimum        240           240
Maximum        955           958
Zeros          0             0
Zeros (%)      0.0%          0.0%
Negative       0             0
Negative (%)   0.0%          0.0%
Memory size    197.2 KiB     53.2 KiB
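
For a variable with missing values, the "Distinct (%)" figures here are consistent with dividing by the number of non-missing rows rather than by all rows. That denominator is an inference from the numbers, not something the report states; the sketch below checks it for AgeMonths with an illustrative helper.

```python
def distinct_pct(n_distinct, n_rows, n_missing):
    """Distinct values as a percentage of non-missing rows (assumed definition)."""
    return 100.0 * n_distinct / (n_rows - n_missing)


# AgeMonths figures copied from the table above.
agemonths_a = distinct_pct(579, 4365, 2358)
agemonths_b = distinct_pct(358, 1339, 666)
print(f"Dataset A: {agemonths_a:.1f}%  Dataset B: {agemonths_b:.1f}%")
```

Dividing by all rows instead would give about 13.3% and 26.7%, which do not match the report.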

Quantile statistics

                            Dataset A   Dataset B
Minimum                     240         240
5-th percentile             276         269.6
Q1                          401.5       383
median                      529         560
Q3                          675         729
95-th percentile            867         901.4
Maximum                     955         958
Range                       715         718
Interquartile range (IQR)   273.5       346

Descriptive statistics

                                  Dataset A       Dataset B
Standard deviation                179.2621297     201.2157263
Coefficient of variation (CV)     0.3290224086    0.3531011726
Kurtosis                          -0.8283895594   -1.09201729
Mean                              544.8325859     569.8528975
Median Absolute Deviation (MAD)   139             173
Skewness                          0.2773777624    0.1494733196
Sum                               1093479         383511
Variance                          32134.91114     40487.76851
Monotonicity                      Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
Dataset A
Value                Count   Frequency (%)
532                  16      0.4%
474                  15      0.3%
498                  14      0.3%
464                  13      0.3%
614                  13      0.3%
313                  13      0.3%
480                  12      0.3%
478                  12      0.3%
439                  12      0.3%
469                  11      0.3%
Other values (569)   1876    43.0%
(Missing)            2358    54.0%

Dataset B
Value                Count   Frequency (%)
471                  8       0.6%
924                  7       0.5%
560                  7       0.5%
357                  6       0.4%
348                  6       0.4%
513                  5       0.4%
891                  5       0.4%
892                  5       0.4%
800                  5       0.4%
718                  5       0.4%
Other values (348)   614     45.9%
(Missing)            666     49.7%
Value   Count   Frequency (%)
240     3       0.1%
241     5       0.1%
242     5       0.1%
243     3       0.1%
244     2       < 0.1%

Value   Count   Frequency (%)
240     1       0.1%
241     2       0.1%
242     2       0.1%
243     2       0.1%
246     2       0.1%

Value   Count   Frequency (%)
240     1       < 0.1%
241     2       < 0.1%
242     2       < 0.1%
243     2       < 0.1%
246     2       < 0.1%

Value   Count   Frequency (%)
240     3       0.2%
241     5       0.4%
242     5       0.4%
243     3       0.2%
244     2       0.1%

Race1
Text

               Dataset A   Dataset B
Distinct       5           5
Distinct (%)   0.1%        0.4%
Missing        0           0
Missing (%)    0.0%        0.0%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      8             8
Median length   5             5
Mean length     5.204352806   5.781926811
Min length      5             5

Characters and Unicode

                      Dataset A   Dataset B
Total characters      22717       7742
Distinct characters   18          18
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A   Dataset B
1st row   White       White
2nd row   Other       White
3rd row   White       Mexican
4th row   White       White
5th row   White       Mexican

Dataset A
Value      Count   Frequency (%)
white      3188    73.0%
black      422     9.7%
other      403     9.2%
hispanic   188     4.3%
mexican    164     3.8%

Dataset B
Value      Count   Frequency (%)
white      622     46.5%
mexican    315     23.5%
black      178     13.3%
hispanic   139     10.4%
other      85      6.3%

Most occurring characters

Dataset A
Value              Count   Frequency (%)
e                  3755    16.5%
i                  3728    16.4%
t                  3591    15.8%
h                  3591    15.8%
W                  3188    14.0%
a                  774     3.4%
c                  774     3.4%
k                  422     1.9%
l                  422     1.9%
B                  422     1.9%
Other values (8)   2050    9.0%

Dataset B
Value              Count   Frequency (%)
i                  1215    15.7%
e                  1022    13.2%
t                  707     9.1%
h                  707     9.1%
c                  632     8.2%
a                  632     8.2%
W                  622     8.0%
n                  454     5.9%
x                  315     4.1%
M                  315     4.1%
Other values (8)   1121    14.5%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   22717 (100.0%)   7742 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

Race3
Text

               Dataset A   Dataset B
Distinct       6           6
Distinct (%)   0.3%        1.0%
Missing        2077        722
Missing (%)    47.6%       53.9%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      8             8
Median length   5             5
Mean length     5.181381119   5.858995138
Min length      5             5

Characters and Unicode

                      Dataset A   Dataset B
Total characters      11855       3615
Distinct characters   19          19
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A   Dataset B
1st row   White       White
2nd row   White       White
3rd row   White       Black
4th row   Black       Black
5th row   White       White

Dataset A
Value      Count   Frequency (%)
white      1669    72.9%
black      228     10.0%
asian      166     7.3%
hispanic   87      3.8%
mexican    77      3.4%
other      61      2.7%

Dataset B
Value      Count   Frequency (%)
white      282     45.7%
mexican    133     21.6%
hispanic   88      14.3%
black      74      12.0%
asian      28      4.5%
other      12      1.9%

Most occurring characters

Dataset A
Value              Count   Frequency (%)
i                  2086    17.6%
e                  1807    15.2%
h                  1730    14.6%
t                  1730    14.6%
W                  1669    14.1%
a                  558     4.7%
c                  392     3.3%
n                  330     2.8%
s                  253     2.1%
k                  228     1.9%
Other values (9)   1072    9.0%

Dataset B
Value              Count   Frequency (%)
i                  619     17.1%
e                  427     11.8%
a                  323     8.9%
c                  295     8.2%
t                  294     8.1%
h                  294     8.1%
W                  282     7.8%
n                  249     6.9%
x                  133     3.7%
M                  133     3.7%
Other values (9)   566     15.7%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   11855 (100.0%)   3615 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

Education
Text

               Dataset A   Dataset B
Distinct       2           2
Distinct (%)   < 0.1%      0.1%
Missing        0           0
Missing (%)    0.0%        0.0%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A   Dataset B
Max length      11          11
Median length   11          11
Mean length     11          9.989544436
Min length      11          8

Characters and Unicode

                      Dataset A   Dataset B
Total characters      48015       13376
Distinct characters   11          11
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A     Dataset B
1st row   CollegeGrad   8thGrade
2nd row   SomeCollege   8thGrade
3rd row   CollegeGrad   8thGrade
4th row   SomeCollege   8thGrade
5th row   SomeCollege   9_11thGrade

Dataset A
Value         Count   Frequency (%)
somecollege   2267    51.9%
collegegrad   2098    48.1%

Dataset B
Value         Count   Frequency (%)
9_11thgrade   888     66.3%
8thgrade      451     33.7%

Most occurring characters

Dataset A
Value   Count   Frequency (%)
e       10997   22.9%
l       8730    18.2%
o       6632    13.8%
C       4365    9.1%
g       4365    9.1%
S       2267    4.7%
m       2267    4.7%
G       2098    4.4%
r       2098    4.4%
a       2098    4.4%

Dataset B
Value   Count   Frequency (%)
1       1776    13.3%
t       1339    10.0%
h       1339    10.0%
G       1339    10.0%
r       1339    10.0%
a       1339    10.0%
d       1339    10.0%
e       1339    10.0%
9       888     6.6%
_       888     6.6%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   48015 (100.0%)   13376 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

MaritalStatus
Text

               Dataset A   Dataset B
Distinct       6           6
Distinct (%)   0.1%        0.4%
Missing        3           0
Missing (%)    0.1%        0.0%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      12            12
Median length   7             7
Mean length     8.375515818   8.412247946
Min length      7             7

Characters and Unicode

                      Dataset A   Dataset B
Total characters      36534       11264
Distinct characters   19          19
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A   Dataset B
1st row   Married     Married
2nd row   Married     Married
3rd row   Married     Married
4th row   Married     Widowed
5th row   Married     Married

Dataset A
Value          Count   Frequency (%)
married        2522    57.8%
nevermarried   864     19.8%
divorced       418     9.6%
livepartner    272     6.2%
widowed        199     4.6%
separated      87      2.0%

Dataset B
Value          Count   Frequency (%)
married        647     48.3%
nevermarried   211     15.8%
widowed        156     11.7%
livepartner    151     11.3%
divorced       116     8.7%
separated      58      4.3%

Most occurring characters

Dataset A
Value              Count   Frequency (%)
r                  8685    23.8%
e                  6449    17.7%
d                  4289    11.7%
i                  4275    11.7%
a                  3832    10.5%
M                  3386    9.3%
v                  1554    4.3%
N                  864     2.4%
o                  617     1.7%
D                  418     1.1%
Other values (9)   2165    5.9%

Dataset B
Value              Count   Frequency (%)
r                  2403    21.3%
e                  1970    17.5%
d                  1344    11.9%
i                  1281    11.4%
a                  1125    10.0%
M                  858     7.6%
v                  478     4.2%
o                  272     2.4%
N                  211     1.9%
t                  209     1.9%
Other values (9)   1113    9.9%

Most occurring categories, scripts, and blocks

Value       Dataset A        Dataset B
(unknown)   36534 (100.0%)   11264 (100.0%)

Most frequent character per category, script, and block

Every character in both datasets is assigned to a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table above.

HHIncome
Text

               Dataset A   Dataset B
Distinct       12          12
Distinct (%)   0.3%        1.0%
Missing        301         173
Missing (%)    6.9%        12.9%
Memory size    197.2 KiB   53.2 KiB

Length

                Dataset A     Dataset B
Max length      11            11
Median length   11            11
Mean length     10.60826772   10.71012007
Min length      7             7

Characters and Unicode

                      Dataset A   Dataset B
Total characters      43112       12488
Distinct characters   15          15
Distinct categories   1           1
Distinct scripts      1           1
Distinct blocks       1           1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

             Dataset A   Dataset B
Unique       0           0
Unique (%)   0.0%        0.0%

Sample

          Dataset A     Dataset B
1st row   more 99999    20000-24999
2nd row   10000-14999   15000-19999
3rd row   75000-99999   65000-74999
4th row   35000-44999   65000-74999
5th row   75000-99999   5000-9999

Dataset A
Value              Count   Frequency (%)
more               1357    25.0%
99999              1357    25.0%
75000-99999        604     11.1%
45000-54999        350     6.5%
25000-34999        341     6.3%
35000-44999        319     5.9%
65000-74999        282     5.2%
55000-64999        275     5.1%
20000-24999        160     3.0%
10000-14999        142     2.6%
Other values (3)   234     4.3%

Dataset B
Value              Count   Frequency (%)
25000-34999        186     15.1%
15000-19999        143     11.6%
20000-24999        136     11.0%
10000-14999        127     10.3%
35000-44999        119     9.6%
45000-54999        76      6.2%
75000-99999        71      5.8%
5000-9999          70      5.7%
more               68      5.5%
99999              68      5.5%
Other values (3)   170     13.8%

Most occurring characters

Dataset A
Value              Count   Frequency (%)
9                  16302   37.8%
0                  8331    19.3%
5                  2984    6.9%
-                  2707    6.3%
4                  2584    6.0%
(space)            1454    3.4%
m                  1357    3.1%
o                  1357    3.1%
r                  1357    3.1%
e                  1357    3.1%
Other values (5)   3322    7.7%

Dataset B
Value              Count   Frequency (%)
9                  3989    31.9%
0                  3457    27.7%
-                  1098    8.8%
4                  1009    8.1%
5                  928     7.4%
1                  540     4.3%
2                  458     3.7%
3                  305     2.4%
(space)            188     1.5%
7                  124     1.0%
Other values (5)   392     3.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 43112
100.0%
ValueCountFrequency (%)
(unknown) 12488
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 43112
100.0%
ValueCountFrequency (%)
(unknown) 12488
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 43112
100.0%
ValueCountFrequency (%)
(unknown) 12488
100.0%

HHIncomeMid
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 12 | 12
Distinct (%) | 0.3% | 1.0%
Missing | 301 | 173
Missing (%) | 6.9% | 12.9%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 67310.5315 | 36605.91767
Minimum | 2500 | 2500
Maximum | 100000 | 100000
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 2500 | 2500
5-th percentile | 12500 | 7500
Q1 | 40000 | 17500
median | 70000 | 30000
Q3 | 100000 | 50000
95-th percentile | 100000 | 100000
Maximum | 100000 | 100000
Range | 97500 | 97500
Interquartile range (IQR) | 60000 | 32500

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 31353.02749 | 26957.27297
Coefficient of variation (CV) | 0.4657967601 | 0.736418445
Kurtosis | -1.279221631 | 0.01258436765
Mean | 67310.5315 | 36605.91767
Median Absolute Deviation (MAD) | 30000 | 12500
Skewness | -0.4146437607 | 0.9999381321
Sum | 273550000 | 42682500
Variance | 983012333.1 | 726694565.8
Monotonicity | Not monotonic | Not monotonic
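
Several of the HHIncomeMid figures for Dataset A can be cross-checked from the others: range is maximum minus minimum, IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported values; `derived_stats` is a hypothetical helper, and small differences in the last digits are expected because the inputs are already rounded.

```python
def derived_stats(minimum, q1, q3, maximum, mean, std):
    """Recompute summary figures that follow directly from other reported values."""
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "cv": std / mean,       # coefficient of variation
        "variance": std ** 2,   # variance as the squared standard deviation
    }


# Reported HHIncomeMid values for Dataset A.
stats_a = derived_stats(
    minimum=2500, q1=40000, q3=100000, maximum=100000,
    mean=67310.5315, std=31353.02749,
)
for name, value in stats_a.items():
    print(name, value)
```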
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
100000 1357
31.1%
87500 604
13.8%
50000 350
 
8.0%
30000 341
 
7.8%
40000 319
 
7.3%
70000 282
 
6.5%
60000 275
 
6.3%
22500 160
 
3.7%
12500 142
 
3.3%
17500 137
 
3.1%
Other values (2) 97
 
2.2%
(Missing) 301
 
6.9%
ValueCountFrequency (%)
30000 186
13.9%
17500 143
10.7%
22500 136
10.2%
12500 127
9.5%
40000 119
8.9%
50000 76
5.7%
87500 71
 
5.3%
7500 70
 
5.2%
100000 68
 
5.1%
60000 67
 
5.0%
Other values (2) 103
7.7%
(Missing) 173
12.9%
ValueCountFrequency (%)
2500 46
 
1.1%
7500 51
 
1.2%
12500 142
3.3%
17500 137
3.1%
22500 160
3.7%
ValueCountFrequency (%)
2500 50
 
3.7%
7500 70
5.2%
12500 127
9.5%
17500 143
10.7%
22500 136
10.2%
ValueCountFrequency (%)
2500 50
 
1.1%
7500 70
1.6%
12500 127
2.9%
17500 143
3.3%
22500 136
3.1%
ValueCountFrequency (%)
2500 46
 
3.4%
7500 51
 
3.8%
12500 142
10.6%
17500 137
10.2%
22500 160
11.9%
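
The HHIncomeMid values line up with the HHIncome brackets by count (in Dataset A, 604 rows for both 75000-99999 and 87500, and 1357 rows for both "more 99999" and 100000), which suggests that each numeric value is the midpoint of its bracket. A minimal sketch of that mapping follows. It is inferred from the counts only: `bracket_midpoint` is a hypothetical helper, and treating the open-ended bracket as its bound plus one is an assumption.

```python
def bracket_midpoint(label):
    """Map an income bracket label such as '75000-99999' to its rounded midpoint."""
    label = label.strip()
    if label.startswith("more"):
        # Assumption: the open-ended bracket "more 99999" is coded as 100000.
        return int(label.split()[1]) + 1
    low, high = (int(part) for part in label.split("-"))
    # (low + high + 1) // 2 rounds the .5 midpoint up, e.g. 87499.5 -> 87500.
    return (low + high + 1) // 2


for label in ["75000-99999", "20000-24999", " 5000-9999", "more 99999"]:
    print(repr(label), bracket_midpoint(label))
```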

Poverty
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 391 | 280
Distinct (%) | 9.5% | 23.6%
Missing | 259 | 154
Missing (%) | 5.9% | 11.5%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 3.454510472 | 1.747974684
Minimum | 0 | 0
Maximum | 5 | 5
Zeros | 23 | 14
Zeros (%) | 0.5% | 1.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB
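
The percentage rows for Poverty can be reproduced from the counts. The reported figures are consistent with Missing (%) being taken over all observations and Distinct (%) over the non-missing observations only; that reading is inferred from the numbers in this report rather than from the tool's documentation. The helper `pct` is hypothetical.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# (observations, missing, distinct) for Poverty in each dataset.
poverty = {"Dataset A": (4365, 259, 391), "Dataset B": (1339, 154, 280)}

results = {}
for name, (n, missing, distinct) in poverty.items():
    results[name] = {
        "missing_pct": pct(missing, n),
        "distinct_pct": pct(distinct, n - missing),  # assumed denominator
    }
    print(name, results[name])
```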

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 0 | 0
5-th percentile | 0.72 | 0.234
Q1 | 2.03 | 0.82
median | 3.85 | 1.31
Q3 | 5 | 2.45
95-th percentile | 5 | 5
Maximum | 5 | 5
Range | 5 | 5
Interquartile range (IQR) | 2.97 | 1.63

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 1.576905329 | 1.3157036
Coefficient of variation (CV) | 0.4564772178 | 0.7527017479
Kurtosis | -1.131187589 | 0.272914025
Mean | 3.454510472 | 1.747974684
Median Absolute Deviation (MAD) | 1.15 | 0.63
Skewness | -0.5395722861 | 1.073458618
Sum | 14184.22 | 2071.35
Variance | 2.486630418 | 1.731075962
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5 1442
33.0%
3.51 41
 
0.9%
3.4 34
 
0.8%
0.92 32
 
0.7%
3.43 32
 
0.7%
2.75 31
 
0.7%
4.54 31
 
0.7%
1.31 29
 
0.7%
3.97 28
 
0.6%
4.59 27
 
0.6%
Other values (381) 2379
54.5%
(Missing) 259
 
5.9%
ValueCountFrequency (%)
5 62
 
4.6%
0.81 16
 
1.2%
1.63 15
 
1.1%
0.89 14
 
1.0%
0.82 14
 
1.0%
0 14
 
1.0%
1.24 14
 
1.0%
1.37 14
 
1.0%
1.11 13
 
1.0%
1.02 13
 
1.0%
Other values (270) 996
74.4%
(Missing) 154
 
11.5%
ValueCountFrequency (%)
0 23
0.5%
0.01 3
 
0.1%
0.03 1
 
< 0.1%
0.04 1
 
< 0.1%
0.05 2
 
< 0.1%
ValueCountFrequency (%)
0 14
1.0%
0.01 3
 
0.2%
0.02 2
 
0.1%
0.03 4
 
0.3%
0.04 2
 
0.1%
ValueCountFrequency (%)
0 14
0.3%
0.01 3
 
0.1%
0.02 2
 
< 0.1%
0.03 4
 
0.1%
0.04 2
 
< 0.1%
ValueCountFrequency (%)
0 23
1.7%
0.01 3
 
0.2%
0.03 1
 
0.1%
0.04 1
 
0.1%
0.05 2
 
0.1%

HomeRooms
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 13 | 13
Distinct (%) | 0.3% | 1.0%
Missing | 23 | 16
Missing (%) | 0.5% | 1.2%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 6.4866421 | 5.287981859
Minimum | 1 | 1
Maximum | 13 | 13
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 1 | 1
5-th percentile | 3 | 3
Q1 | 5 | 4
median | 6 | 5
Q3 | 8 | 6
95-th percentile | 11 | 9
Maximum | 13 | 13
Range | 12 | 12
Interquartile range (IQR) | 3 | 2

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 2.350370154 | 1.844827358
Coefficient of variation (CV) | 0.3623400394 | 0.348871726
Kurtosis | -0.07447598025 | 1.069539551
Mean | 6.4866421 | 5.287981859
Median Absolute Deviation (MAD) | 2 | 1
Skewness | 0.344503497 | 0.7235017307
Sum | 28165 | 6996
Variance | 5.524239862 | 3.403387982
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%)
7 722
16.5%
6 692
15.9%
5 630
14.4%
4 585
13.4%
8 544
12.5%
9 324
7.4%
3 257
 
5.9%
10 256
 
5.9%
11 123
 
2.8%
13 59
 
1.4%
Other values (3) 150
 
3.4%
ValueCountFrequency (%)
4 309
23.1%
5 293
21.9%
6 237
17.7%
7 166
12.4%
3 143
10.7%
8 63
 
4.7%
9 35
 
2.6%
2 26
 
1.9%
10 17
 
1.3%
11 16
 
1.2%
Other values (3) 18
 
1.3%
(Missing) 16
 
1.2%
ValueCountFrequency (%)
1 50
 
1.1%
2 50
 
1.1%
3 257
5.9%
4 585
13.4%
5 630
14.4%
ValueCountFrequency (%)
1 14
 
1.0%
2 26
 
1.9%
3 143
10.7%
4 309
23.1%
5 293
21.9%
ValueCountFrequency (%)
1 14
 
0.3%
2 26
 
0.6%
3 143
3.3%
4 309
7.1%
5 293
6.7%
ValueCountFrequency (%)
1 50
 
3.7%
2 50
 
3.7%
3 257
19.2%
4 585
43.7%
5 630
47.1%

HomeOwn
['Text', 'Text']

Statistic | Dataset A | Dataset B
Distinct | 3 | 3
Distinct (%) | 0.1% | 0.2%
Missing | 18 | 14
Missing (%) | 0.4% | 1.0%
Memory size | 197.2 KiB | 53.2 KiB

Length

Statistic | Dataset A | Dataset B
Max length | 5 | 5
Median length | 3 | 3
Mean length | 3.318150449 | 3.498113208
Min length | 3 | 3

Characters and Unicode

Statistic | Dataset A | Dataset B
Total characters | 14424 | 4635
Distinct characters | 8 | 8
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Dataset A | Dataset B
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Dataset A | Dataset B
1st row | Own | Own
2nd row | Own | Rent
3rd row | Own | Own
4th row | Rent | Rent
5th row | Own | Rent
Dataset A
Value | Count | Frequency (%)
own | 3063 | 70.5%
rent | 1185 | 27.3%
other | 99 | 2.3%

Dataset B
Value | Count | Frequency (%)
own | 704 | 53.1%
rent | 582 | 43.9%
other | 39 | 2.9%
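
The word counts for HomeOwn list "own" and "rent" while the sample rows show "Own" and "Rent", and HHIncome's "more 99999" is counted as the two tokens "more" and "99999". Both observations are consistent with lower-casing each value and splitting it on whitespace before counting. The sketch below assumes exactly that; the profiler's actual tokenizer is not shown in this report, and `word_counts` is a hypothetical helper.

```python
from collections import Counter


def word_counts(values):
    """Count lower-cased, whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue  # missing cells contribute no tokens
        counts.update(value.lower().split())
    return counts


sample = ["Own", "Rent", "Own", None, "more 99999"]
counts = word_counts(sample)
for token, n in counts.most_common():
    print(token, n)
```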

Most occurring characters

ValueCountFrequency (%)
n 4248
29.5%
O 3162
21.9%
w 3063
21.2%
e 1284
 
8.9%
t 1284
 
8.9%
R 1185
 
8.2%
h 99
 
0.7%
r 99
 
0.7%
ValueCountFrequency (%)
n 1286
27.7%
O 743
16.0%
w 704
15.2%
e 621
13.4%
t 621
13.4%
R 582
12.6%
h 39
 
0.8%
r 39
 
0.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 14424
100.0%
ValueCountFrequency (%)
(unknown) 4635
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 14424
100.0%
ValueCountFrequency (%)
(unknown) 4635
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 14424
100.0%
ValueCountFrequency (%)
(unknown) 4635
100.0%

Work
['Text', 'Text']

Statistic | Dataset A | Dataset B
Distinct | 3 | 3
Distinct (%) | 0.1% | 0.2%
Missing | 0 | 1
Missing (%) | 0.0% | 0.1%
Memory size | 197.2 KiB | 53.2 KiB

Length

Statistic | Dataset A | Dataset B
Max length | 10 | 10
Median length | 7 | 10
Mean length | 7.870103093 | 8.540358744
Min length | 7 | 7

Characters and Unicode

Statistic | Dataset A | Dataset B
Total characters | 34353 | 11427
Distinct characters | 10 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Dataset A | Dataset B
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Dataset A | Dataset B
1st row | Working | NotWorking
2nd row | NotWorking | NotWorking
3rd row | NotWorking | Working
4th row | Working | NotWorking
5th row | Working | Working
Dataset A
Value | Count | Frequency (%)
working | 2939 | 67.3%
notworking | 1266 | 29.0%
looking | 160 | 3.7%

Dataset B
Value | Count | Frequency (%)
notworking | 687 | 51.3%
working | 599 | 44.8%
looking | 52 | 3.9%

Most occurring characters

ValueCountFrequency (%)
o 5791
16.9%
k 4365
12.7%
i 4365
12.7%
n 4365
12.7%
g 4365
12.7%
W 4205
12.2%
r 4205
12.2%
N 1266
 
3.7%
t 1266
 
3.7%
L 160
 
0.5%
ValueCountFrequency (%)
o 2077
18.2%
k 1338
11.7%
i 1338
11.7%
n 1338
11.7%
g 1338
11.7%
W 1286
11.3%
r 1286
11.3%
N 687
 
6.0%
t 687
 
6.0%
L 52
 
0.5%
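
The character counts for Work in Dataset A follow from the value counts: each value contributes its characters once per row. The sketch below rebuilds them. The capitalised spellings are taken from the sample rows, except "Looking", which is assumed from the word count "looking" and the capital "L" in the character table.

```python
from collections import Counter

# Value counts for Work in Dataset A ("Looking" spelling is an assumption).
value_counts = {"Working": 2939, "NotWorking": 1266, "Looking": 160}

char_counts = Counter()
for value, rows in value_counts.items():
    for ch in value:
        # Every row holding this value adds one occurrence of each character.
        char_counts[ch] += rows

total_characters = sum(char_counts.values())
for ch, n in char_counts.most_common():
    print(ch, n, f"{100 * n / total_characters:.1f}%")
```

The totals agree with the report: "o" appears 5791 times and the column holds 34353 characters in all.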

Most occurring categories

ValueCountFrequency (%)
(unknown) 34353
100.0%
ValueCountFrequency (%)
(unknown) 11427
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 34353
100.0%
ValueCountFrequency (%)
(unknown) 11427
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 34353
100.0%
ValueCountFrequency (%)
(unknown) 11427
100.0%

Weight
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 798 | 530
Distinct (%) | 18.4% | 39.9%
Missing | 30 | 11
Missing (%) | 0.7% | 0.8%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 82.33997693 | 80.91731928
Minimum | 39.7 | 38.5
Maximum | 203 | 223
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 39.7 | 38.5
5-th percentile | 53.1 | 53.07
Q1 | 67 | 66.375
median | 79.8 | 78
Q3 | 95.2 | 90.925
95-th percentile | 119.33 | 117.86
Maximum | 203 | 223
Range | 163.3 | 184.5
Interquartile range (IQR) | 28.2 | 24.55

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 20.78435648 | 21.5063777
Coefficient of variation (CV) | 0.2524212085 | 0.2657821328
Kurtosis | 1.156208635 | 4.380167832
Mean | 82.33997693 | 80.91731928
Median Absolute Deviation (MAD) | 13.7 | 12.4
Skewness | 0.8023564927 | 1.386329681
Sum | 356943.8 | 107458.2
Variance | 431.9894744 | 462.5242816
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
77.3 23
 
0.5%
90.3 23
 
0.5%
73.8 21
 
0.5%
61.7 21
 
0.5%
65.3 20
 
0.5%
84.8 20
 
0.5%
91.1 18
 
0.4%
61.1 18
 
0.4%
91.7 17
 
0.4%
63.3 17
 
0.4%
Other values (788) 4137
94.8%
(Missing) 30
 
0.7%
ValueCountFrequency (%)
64.5 11
 
0.8%
87.3 11
 
0.8%
78.1 9
 
0.7%
79.2 9
 
0.7%
83.2 9
 
0.7%
65.1 9
 
0.7%
99.3 8
 
0.6%
76 8
 
0.6%
70.8 8
 
0.6%
140.1 8
 
0.6%
Other values (520) 1238
92.5%
(Missing) 11
 
0.8%
ValueCountFrequency (%)
39.7 1
 
< 0.1%
40.2 1
 
< 0.1%
40.8 1
 
< 0.1%
41.1 3
0.1%
41.5 1
 
< 0.1%
ValueCountFrequency (%)
38.5 1
 
0.1%
38.7 1
 
0.1%
39.3 2
0.1%
40.7 3
0.2%
40.8 1
 
0.1%
ValueCountFrequency (%)
38.5 1
 
< 0.1%
38.7 1
 
< 0.1%
39.3 2
< 0.1%
40.7 3
0.1%
40.8 1
 
< 0.1%
ValueCountFrequency (%)
39.7 1
 
0.1%
40.2 1
 
0.1%
40.8 1
 
0.1%
41.1 3
0.2%
41.5 1
 
0.1%

Length
[None, None]

Statistic | Dataset A | Dataset B
Missing | 4365 | 1339
Missing (%) | 100.0% | 100.0%
Memory size | 197.2 KiB | 53.2 KiB

HeadCirc
[None, None]

Statistic | Dataset A | Dataset B
Missing | 4365 | 1339
Missing (%) | 100.0% | 100.0%
Memory size | 197.2 KiB | 53.2 KiB

Height
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 458 | 370
Distinct (%) | 10.6% | 27.9%
Missing | 28 | 13
Missing (%) | 0.6% | 1.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 169.9250865 | 166.0863499
Minimum | 140 | 134.5
Maximum | 199.9 | 200.4
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 140 | 134.5
5-th percentile | 154.5 | 149.5
Q1 | 162.8 | 157.9
median | 169.6 | 166.2
Q3 | 176.8 | 173.6
95-th percentile | 186.6 | 183.4
Maximum | 199.9 | 200.4
Range | 59.9 | 65.9
Interquartile range (IQR) | 14 | 15.7

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 9.778168584 | 10.50775034
Coefficient of variation (CV) | 0.05754399652 | 0.06326679074
Kurtosis | -0.378252253 | -0.5062421774
Mean | 169.9250865 | 166.0863499
Median Absolute Deviation (MAD) | 7 | 7.8
Skewness | 0.1232691608 | 0.06452696498
Sum | 736965.1 | 220230.5
Variance | 95.61258086 | 110.4128173
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
171.9 29
 
0.7%
173.8 28
 
0.6%
163.6 28
 
0.6%
160.7 27
 
0.6%
160.6 27
 
0.6%
176.6 27
 
0.6%
177.1 26
 
0.6%
166.7 26
 
0.6%
175.3 26
 
0.6%
173.2 25
 
0.6%
Other values (448) 4068
93.2%
(Missing) 28
 
0.6%
ValueCountFrequency (%)
173.2 12
 
0.9%
178.8 11
 
0.8%
157.9 11
 
0.8%
173.1 11
 
0.8%
172.6 11
 
0.8%
168 11
 
0.8%
156.7 10
 
0.7%
163.1 10
 
0.7%
173 10
 
0.7%
162.8 10
 
0.7%
Other values (360) 1219
91.0%
(Missing) 13
 
1.0%
ValueCountFrequency (%)
140 1
< 0.1%
141.2 2
< 0.1%
143.1 1
< 0.1%
143.3 1
< 0.1%
143.8 1
< 0.1%
ValueCountFrequency (%)
134.5 1
0.1%
139.8 1
0.1%
141.3 2
0.1%
143.4 1
0.1%
143.6 1
0.1%
ValueCountFrequency (%)
134.5 1
< 0.1%
139.8 1
< 0.1%
141.3 2
< 0.1%
143.4 1
< 0.1%
143.6 1
< 0.1%
ValueCountFrequency (%)
140 1
0.1%
141.2 2
0.1%
143.1 1
0.1%
143.3 1
0.1%
143.8 1
0.1%

BMI
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 1198 | 636
Distinct (%) | 27.6% | 48.0%
Missing | 32 | 14
Missing (%) | 0.7% | 1.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 28.41318255 | 29.25124528
Minimum | 15.02 | 15.7
Maximum | 68.63 | 67.83
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 15.02 | 15.7
5-th percentile | 20 | 20.3
Q1 | 23.8 | 24.8
median | 27.4 | 28.1
Q3 | 31.8 | 32.5
95-th percentile | 40.68 | 41.862
Maximum | 68.63 | 67.83
Range | 53.61 | 52.13
Interquartile range (IQR) | 8 | 7.7

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 6.408522622 | 6.836315771
Coefficient of variation (CV) | 0.2255475116 | 0.2337102474
Kurtosis | 2.192624826 | 3.536839754
Mean | 28.41318255 | 29.25124528
Median Absolute Deviation (MAD) | 4 | 3.78
Skewness | 1.104189255 | 1.315612394
Sum | 123114.32 | 38757.9
Variance | 41.06916219 | 46.73521333
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25.6 40
 
0.9%
22.6 35
 
0.8%
23.2 31
 
0.7%
26.4 28
 
0.6%
24.5 27
 
0.6%
30.5 27
 
0.6%
25.2 26
 
0.6%
22.5 25
 
0.6%
29.7 25
 
0.6%
28 24
 
0.5%
Other values (1188) 4045
92.7%
(Missing) 32
 
0.7%
ValueCountFrequency (%)
28.6 15
 
1.1%
27.8 12
 
0.9%
32.5 12
 
0.9%
26.9 11
 
0.8%
24.7 10
 
0.7%
26.6 10
 
0.7%
27.2 10
 
0.7%
29.4 9
 
0.7%
28.9 9
 
0.7%
23.5 8
 
0.6%
Other values (626) 1219
91.0%
(Missing) 14
 
1.0%
ValueCountFrequency (%)
15.02 1
 
< 0.1%
15.98 1
 
< 0.1%
16.51 1
 
< 0.1%
16.6 3
0.1%
16.73 3
0.1%
ValueCountFrequency (%)
15.7 1
 
0.1%
15.98 3
0.2%
16.03 1
 
0.1%
16.38 1
 
0.1%
17.7 2
0.1%
ValueCountFrequency (%)
15.7 1
 
< 0.1%
15.98 3
0.1%
16.03 1
 
< 0.1%
16.38 1
 
< 0.1%
17.7 2
< 0.1%
ValueCountFrequency (%)
15.02 1
 
0.1%
15.98 1
 
0.1%
16.51 1
 
0.1%
16.6 3
0.2%
16.73 3
0.2%

BMICatUnder20yrs
[None, None]

Statistic | Dataset A | Dataset B
Missing | 4365 | 1339
Missing (%) | 100.0% | 100.0%
Memory size | 197.2 KiB | 53.2 KiB

BMI_WHO
['Text', 'Text']

Statistic | Dataset A | Dataset B
Distinct | 4 | 4
Distinct (%) | 0.1% | 0.3%
Missing | 57 | 18
Missing (%) | 1.3% | 1.3%
Memory size | 197.2 KiB | 53.2 KiB

Length

Statistic | Dataset A | Dataset B
Max length | 12 | 12
Median length | 12 | 12
Mean length | 10.90668524 | 10.83043149
Min length | 9 | 9

Characters and Unicode

Statistic | Dataset A | Dataset B
Total characters | 46986 | 14307
Distinct characters | 16 | 16
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Dataset A | Dataset B
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Dataset A | Dataset B
1st row | 18.5_to_24.9 | 25.0_to_29.9
2nd row | 18.5_to_24.9 | 25.0_to_29.9
3rd row | 30.0_plus | 30.0_plus
4th row | 30.0_plus | 25.0_to_29.9
5th row | 30.0_plus | 25.0_to_29.9
Dataset A
Value | Count | Frequency (%)
30.0_plus | 1502 | 34.9%
25.0_to_29.9 | 1397 | 32.4%
18.5_to_24.9 | 1341 | 31.1%
12.0_18.5 | 68 | 1.6%

Dataset B
Value | Count | Frequency (%)
30.0_plus | 493 | 37.3%
25.0_to_29.9 | 478 | 36.2%
18.5_to_24.9 | 328 | 24.8%
12.0_18.5 | 22 | 1.7%
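
The four BMI_WHO labels read as BMI ranges. A minimal sketch of how a BMI value could be binned into them is shown below. The cut points at 18.5, 25.0, and 30.0 come from the labels themselves, but the handling of values that fall exactly on a boundary (or below 12.0) is an assumption, and `bmi_who_category` is a hypothetical helper rather than the code that produced this column.

```python
def bmi_who_category(bmi):
    """Bin a BMI value into one of the four BMI_WHO labels seen in the report."""
    if bmi < 18.5:
        return "12.0_18.5"
    if bmi < 25.0:
        return "18.5_to_24.9"
    if bmi < 30.0:
        return "25.0_to_29.9"
    return "30.0_plus"


# Minimum, Q1, median, and maximum BMI reported for Dataset A.
for bmi in [15.02, 23.8, 27.4, 68.63]:
    print(bmi, bmi_who_category(bmi))
```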

Most occurring characters

ValueCountFrequency (%)
. 7114
15.1%
_ 7046
15.0%
0 4469
9.5%
2 4203
8.9%
9 4135
8.8%
5 2806
 
6.0%
t 2738
 
5.8%
o 2738
 
5.8%
3 1502
 
3.2%
p 1502
 
3.2%
Other values (6) 8733
18.6%
ValueCountFrequency (%)
. 2149
15.0%
_ 2127
14.9%
0 1486
10.4%
2 1306
9.1%
9 1284
9.0%
5 828
 
5.8%
t 806
 
5.6%
o 806
 
5.6%
3 493
 
3.4%
p 493
 
3.4%
Other values (6) 2529
17.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 46986
100.0%
ValueCountFrequency (%)
(unknown) 14307
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 46986
100.0%
ValueCountFrequency (%)
(unknown) 14307
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 46986
100.0%
ValueCountFrequency (%)
(unknown) 14307
100.0%

Pulse
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 41 | 39
Distinct (%) | 1.0% | 3.0%
Missing | 147 | 48
Missing (%) | 3.4% | 3.6%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 72.16026553 | 73.08597986
Minimum | 40 | 42
Maximum | 136 | 122
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 40 | 42
5-th percentile | 54 | 54
Q1 | 64 | 64
median | 72 | 72
Q3 | 80 | 82
95-th percentile | 92 | 96
Maximum | 136 | 122
Range | 96 | 80
Interquartile range (IQR) | 16 | 18

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 11.5461941 | 12.72638278
Coefficient of variation (CV) | 0.1600076443 | 0.1741289206
Kurtosis | 0.5711985209 | 0.329764511
Mean | 72.16026553 | 73.08597986
Median Absolute Deviation (MAD) | 8 | 8
Skewness | 0.4029495332 | 0.5474870434
Sum | 304372 | 94354
Variance | 133.3145982 | 161.9608188
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
70 298
 
6.8%
68 285
 
6.5%
72 284
 
6.5%
74 277
 
6.3%
76 268
 
6.1%
64 265
 
6.1%
66 257
 
5.9%
78 254
 
5.8%
60 227
 
5.2%
80 203
 
4.7%
Other values (31) 1600
36.7%
ValueCountFrequency (%)
68 97
 
7.2%
70 87
 
6.5%
66 85
 
6.3%
72 82
 
6.1%
64 82
 
6.1%
80 73
 
5.5%
60 70
 
5.2%
82 69
 
5.2%
76 65
 
4.9%
62 59
 
4.4%
Other values (29) 522
39.0%
ValueCountFrequency (%)
40 9
0.2%
42 3
 
0.1%
44 8
0.2%
46 3
 
0.1%
48 15
0.3%
ValueCountFrequency (%)
42 1
 
0.1%
44 3
 
0.2%
46 2
 
0.1%
48 12
0.9%
50 12
0.9%
ValueCountFrequency (%)
42 1
 
< 0.1%
44 3
 
0.1%
46 2
 
< 0.1%
48 12
0.3%
50 12
0.3%
ValueCountFrequency (%)
40 9
0.7%
42 3
 
0.2%
44 8
0.6%
46 3
 
0.2%
48 15
1.1%

BPSysAve
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 111 | 97
Distinct (%) | 2.6% | 7.5%
Missing | 154 | 50
Missing (%) | 3.5% | 3.7%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 119.7394918 | 123.5120248
Minimum | 82 | 78
Maximum | 226 | 221
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 82 | 78
5-th percentile | 97 | 100
Q1 | 108 | 111
median | 118 | 121
Q3 | 128 | 132
95-th percentile | 148 | 158.6
Maximum | 226 | 221
Range | 144 | 143
Interquartile range (IQR) | 20 | 21

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 16.58724306 | 18.23300842
Coefficient of variation (CV) | 0.1385277557 | 0.1476213222
Kurtosis | 2.64204398 | 2.361902326
Mean | 119.7394918 | 123.5120248
Median Absolute Deviation (MAD) | 10 | 10
Skewness | 1.054765578 | 1.106364583
Sum | 504223 | 159207
Variance | 275.1366324 | 332.442596
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
118 152
 
3.5%
114 134
 
3.1%
119 129
 
3.0%
112 127
 
2.9%
115 118
 
2.7%
110 115
 
2.6%
116 110
 
2.5%
111 108
 
2.5%
121 107
 
2.5%
125 107
 
2.5%
Other values (101) 3004
68.8%
(Missing) 154
 
3.5%
ValueCountFrequency (%)
114 59
 
4.4%
124 46
 
3.4%
115 41
 
3.1%
116 39
 
2.9%
125 38
 
2.8%
122 34
 
2.5%
112 33
 
2.5%
111 33
 
2.5%
130 32
 
2.4%
110 32
 
2.4%
Other values (87) 902
67.4%
(Missing) 50
 
3.7%
ValueCountFrequency (%)
82 3
 
0.1%
83 1
 
< 0.1%
84 8
0.2%
85 7
0.2%
86 10
0.2%
ValueCountFrequency (%)
78 1
 
0.1%
80 1
 
0.1%
81 3
0.2%
85 1
 
0.1%
86 1
 
0.1%
ValueCountFrequency (%)
78 1
 
< 0.1%
80 1
 
< 0.1%
81 3
0.1%
85 1
 
< 0.1%
86 1
 
< 0.1%
ValueCountFrequency (%)
82 3
 
0.2%
83 1
 
0.1%
84 8
0.6%
85 7
0.5%
86 10
0.7%

BPDiaAve
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 86 | 77
Distinct (%) | 2.0% | 6.0%
Missing | 154 | 50
Missing (%) | 3.5% | 3.7%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 70.60128236 | 68.68657874
Minimum | 0 | 0
Maximum | 116 | 116
Zeros | 8 | 9
Zeros (%) | 0.2% | 0.7%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 0 | 0
5-th percentile | 52 | 48.4
Q1 | 64 | 61
median | 71 | 69
Q3 | 78 | 77
95-th percentile | 88 | 88
Maximum | 116 | 116
Range | 116 | 116
Interquartile range (IQR) | 14 | 16

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 11.75212674 | 13.62648869
Coefficient of variation (CV) | 0.1664576952 | 0.1983864816
Kurtosis | 3.18400789 | 4.295346223
Mean | 70.60128236 | 68.68657874
Median Absolute Deviation (MAD) | 7 | 8
Skewness | -0.5792421573 | -0.9348834165
Sum | 297302 | 88537
Variance | 138.1124829 | 185.681194
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
69 182
 
4.2%
72 177
 
4.1%
70 172
 
3.9%
67 164
 
3.8%
71 163
 
3.7%
68 159
 
3.6%
75 154
 
3.5%
77 151
 
3.5%
74 149
 
3.4%
66 144
 
3.3%
Other values (76) 2596
59.5%
(Missing) 154
 
3.5%
ValueCountFrequency (%)
69 63
 
4.7%
68 57
 
4.3%
73 49
 
3.7%
65 48
 
3.6%
72 46
 
3.4%
70 43
 
3.2%
66 39
 
2.9%
75 39
 
2.9%
64 38
 
2.8%
74 38
 
2.8%
Other values (67) 829
61.9%
(Missing) 50
 
3.7%
ValueCountFrequency (%)
0 8
0.2%
19 1
 
< 0.1%
21 3
 
0.1%
22 1
 
< 0.1%
25 1
 
< 0.1%
ValueCountFrequency (%)
0 9
0.7%
15 1
 
0.1%
21 2
 
0.1%
24 2
 
0.1%
29 2
 
0.1%
ValueCountFrequency (%)
0 9
0.2%
15 1
 
< 0.1%
21 2
 
< 0.1%
24 2
 
< 0.1%
29 2
 
< 0.1%
ValueCountFrequency (%)
0 8
0.6%
19 1
 
0.1%
21 3
 
0.2%
22 1
 
0.1%
25 1
 
0.1%

BPSys1
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 60 | 56
Distinct (%) | 1.5% | 4.5%
Missing | 302 | 96
Missing (%) | 6.9% | 7.2%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 120.5990647 | 125.1069992
Minimum | 80 | 74
Maximum | 232 | 224
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 80 | 74
5-th percentile | 98 | 102
Q1 | 110 | 112
median | 118 | 122
Q3 | 130 | 134
95-th percentile | 150 | 159.8
Maximum | 232 | 224
Range | 152 | 150
Interquartile range (IQR) | 20 | 22

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 16.59386767 | 18.67073733
Coefficient of variation (CV) | 0.1375953264 | 0.1492381517
Kurtosis | 2.240375431 | 2.142772495
Mean | 120.5990647 | 125.1069992
Median Absolute Deviation (MAD) | 10 | 10
Skewness | 0.9760239732 | 1.058027039
Sum | 489994 | 155508
Variance | 275.3564442 | 348.5964325
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
114 271
 
6.2%
116 244
 
5.6%
124 235
 
5.4%
118 233
 
5.3%
106 202
 
4.6%
112 195
 
4.5%
126 185
 
4.2%
122 184
 
4.2%
110 175
 
4.0%
120 168
 
3.8%
Other values (50) 1971
45.2%
(Missing) 302
 
6.9%
ValueCountFrequency (%)
124 79
 
5.9%
114 74
 
5.5%
118 64
 
4.8%
112 64
 
4.8%
110 59
 
4.4%
120 58
 
4.3%
128 58
 
4.3%
116 57
 
4.3%
130 47
 
3.5%
122 46
 
3.4%
Other values (46) 637
47.6%
(Missing) 96
 
7.2%
ValueCountFrequency (%)
80 2
 
< 0.1%
82 1
 
< 0.1%
84 10
0.2%
86 3
 
0.1%
88 10
0.2%
ValueCountFrequency (%)
74 1
 
0.1%
78 1
 
0.1%
82 2
 
0.1%
92 10
0.7%
94 8
0.6%
ValueCountFrequency (%)
74 1
 
< 0.1%
78 1
 
< 0.1%
82 2
 
< 0.1%
92 10
0.2%
94 8
0.2%
ValueCountFrequency (%)
80 2
 
0.1%
82 1
 
0.1%
84 10
0.7%
86 3
 
0.2%
88 10
0.7%

BPDia1
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 46 | 39
Distinct (%) | 1.1% | 3.1%
Missing | 302 | 96
Missing (%) | 6.9% | 7.2%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 71.18779227 | 69.53660499
Minimum | 0 | 0
Maximum | 118 | 110
Zeros | 7 | 7
Zeros (%) | 0.2% | 0.5%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 0 | 0
5-th percentile | 52 | 50
Q1 | 64 | 62
median | 72 | 70
Q3 | 78 | 78
95-th percentile | 90 | 88
Maximum | 118 | 110
Range | 118 | 110
Interquartile range (IQR) | 14 | 16

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 11.53488612 | 13.01049633
Coefficient of variation (CV) | 0.1620346095 | 0.1871028408
Kurtosis | 2.788502944 | 4.578255596
Mean | 71.18779227 | 69.53660499
Median Absolute Deviation (MAD) | 6 | 8
Skewness | -0.3979440575 | -0.8992679765
Sum | 289236 | 86434
Variance | 133.0535979 | 169.2730149
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=46)
ValueCountFrequency (%)
74 325
 
7.4%
72 308
 
7.1%
70 302
 
6.9%
68 296
 
6.8%
66 287
 
6.6%
76 259
 
5.9%
78 257
 
5.9%
64 255
 
5.8%
62 218
 
5.0%
80 195
 
4.5%
Other values (36) 1361
31.2%
(Missing) 302
 
6.9%
ValueCountFrequency (%)
66 100
 
7.5%
68 97
 
7.2%
62 81
 
6.0%
64 80
 
6.0%
76 78
 
5.8%
72 76
 
5.7%
74 73
 
5.5%
70 73
 
5.5%
78 60
 
4.5%
82 60
 
4.5%
Other values (29) 465
34.7%
(Missing) 96
 
7.2%
ValueCountFrequency (%)
0 7
0.2%
20 1
 
< 0.1%
22 1
 
< 0.1%
28 4
0.1%
30 1
 
< 0.1%
ValueCountFrequency (%)
0 7
0.5%
10 2
 
0.1%
24 1
 
0.1%
34 3
0.2%
36 1
 
0.1%
ValueCountFrequency (%)
0 7
0.2%
10 2
 
< 0.1%
24 1
 
< 0.1%
34 3
0.1%
36 1
 
< 0.1%
ValueCountFrequency (%)
0 7
0.5%
20 1
 
0.1%
22 1
 
0.1%
28 4
0.3%
30 1
 
0.1%

BPSys2
Real number (ℝ)

Statistic | Dataset A | Dataset B
Distinct | 60 | 57
Distinct (%) | 1.5% | 4.5%
Missing | 244 | 85
Missing (%) | 5.6% | 6.3%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 120.0485319 | 123.8341308
Minimum | 82 | 78
Maximum | 226 | 226
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 197.2 KiB | 53.2 KiB

Quantile statistics

Statistic | Dataset A | Dataset B
Minimum | 82 | 78
5-th percentile | 96 | 100
Q1 | 108 | 112
median | 118 | 122
Q3 | 128 | 134
95-th percentile | 148 | 160
Maximum | 226 | 226
Range | 144 | 148
Interquartile range (IQR) | 20 | 22

Descriptive statistics

Statistic | Dataset A | Dataset B
Standard deviation | 16.81450984 | 18.65205194
Coefficient of variation (CV) | 0.1400642688 | 0.150621253
Kurtosis | 2.738677052 | 2.595075861
Mean | 120.0485319 | 123.8341308
Median Absolute Deviation (MAD) | 10 | 10
Skewness | 1.061128071 | 1.141391899
Sum | 494720 | 155288
Variance | 282.7277412 | 347.8990417
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
118 267
 
6.1%
116 246
 
5.6%
114 223
 
5.1%
124 213
 
4.9%
112 200
 
4.6%
110 198
 
4.5%
108 182
 
4.2%
106 181
 
4.1%
120 179
 
4.1%
126 178
 
4.1%
Other values (50) 2054
47.1%
(Missing) 244
 
5.6%
ValueCountFrequency (%)
114 107
 
8.0%
124 76
 
5.7%
118 65
 
4.9%
116 64
 
4.8%
126 59
 
4.4%
112 57
 
4.3%
108 55
 
4.1%
132 55
 
4.1%
110 52
 
3.9%
122 49
 
3.7%
Other values (47) 615
45.9%
(Missing) 85
 
6.3%
ValueCountFrequency (%)
82 1
 
< 0.1%
84 5
 
0.1%
86 21
0.5%
88 14
0.3%
90 20
0.5%
ValueCountFrequency (%)
78 2
0.1%
82 3
0.2%
84 1
 
0.1%
86 2
0.1%
88 1
 
0.1%
ValueCountFrequency (%)
78 2
< 0.1%
82 3
0.1%
84 1
 
< 0.1%
86 2
< 0.1%
88 1
 
< 0.1%
ValueCountFrequency (%)
82 1
 
0.1%
84 5
 
0.4%
86 21
1.6%
88 14
1.0%
90 20
1.5%

BPDia2
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         46              43
Distinct (%)                     1.1%            3.4%
Missing                          244             85
Missing (%)                      5.6%            6.3%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             70.59354526     68.97129187
Minimum                          0               0
Maximum                          118             116
Zeros                            13              9
Zeros (%)                        0.3%            0.7%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  52              50
Q1                               64              62
median                           72              70
Q3                               78              78
95-th percentile                 88              88
Maximum                          118             116
Range                            118             116
Interquartile range (IQR)        14              16

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               11.89648096     13.61601349
Coefficient of variation (CV)    0.1685208034    0.1974156655
Kurtosis                         3.862730782     4.55975518
Mean                             70.59354526     68.97129187
Median Absolute Deviation (MAD)  6               8
Skewness                         -0.6833681222   -0.942156728
Sum                              290916          86490
Variance                         141.5262593     185.3958232
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=46)
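A histogram with fixed-size bins splits the observed range into equal-width intervals and counts the values in each. The following is a minimal standard-library sketch of that idea, not the routine the report itself used. The function name and the sample readings are hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values into `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, so everything lands in one bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical diastolic readings, not taken from either dataset.
sample = [0, 52, 64, 64, 70, 72, 72, 78, 88, 118]
counts = fixed_width_histogram(sample, bins=46)
print(len(counts), sum(counts))
```

With a minimum of 0 and a maximum of 118, 46 bins give a bin width of roughly 2.57, so the zero readings reported for this variable would sit alone in the first bin.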
ValueCountFrequency (%)
72 339
 
7.8%
74 327
 
7.5%
70 321
 
7.4%
66 295
 
6.8%
68 280
 
6.4%
78 267
 
6.1%
64 263
 
6.0%
76 253
 
5.8%
62 185
 
4.2%
80 171
 
3.9%
Other values (36) 1420
32.5%
(Missing) 244
 
5.6%
ValueCountFrequency (%)
70 96
 
7.2%
72 90
 
6.7%
66 86
 
6.4%
62 82
 
6.1%
74 80
 
6.0%
68 76
 
5.7%
64 71
 
5.3%
78 68
 
5.1%
76 63
 
4.7%
82 60
 
4.5%
Other values (33) 482
36.0%
(Missing) 85
 
6.3%
ValueCountFrequency (%)
0 13
0.3%
24 1
 
< 0.1%
26 1
 
< 0.1%
28 1
 
< 0.1%
30 6
0.1%
ValueCountFrequency (%)
0 9
0.7%
10 2
 
0.1%
22 1
 
0.1%
30 1
 
0.1%
34 1
 
0.1%
ValueCountFrequency (%)
0 9
0.2%
10 2
 
< 0.1%
22 1
 
< 0.1%
30 1
 
< 0.1%
34 1
 
< 0.1%
ValueCountFrequency (%)
0 13
1.0%
24 1
 
0.1%
26 1
 
0.1%
28 1
 
0.1%
30 6
0.4%

BPSys3
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         60              56
Distinct (%)                     1.4%            4.4%
Missing                          223             79
Missing (%)                      5.1%            5.9%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             119.3756639     123.2984127
Minimum                          76              80
Maximum                          226             216
Zeros                            0               0
Zeros (%)                        0.0%            0.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          76              80
5-th percentile                  96              100
Q1                               108             110
median                           118             120
Q3                               128             132
95-th percentile                 148             158
Maximum                          226             216
Range                            150             136
Interquartile range (IQR)        20              22
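The last two rows follow directly from the others: Range is Maximum minus Minimum, and the interquartile range is Q3 minus Q1. A short check against the BPSys3 values reported here (the helper name is illustrative):

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> tuple[float, float]:
    """Return (range, interquartile range) from four summary values."""
    return maximum - minimum, q3 - q1


# BPSys3 summary values as reported for each dataset.
print("Dataset A:", spread(76, 108, 128, 226))
print("Dataset B:", spread(80, 110, 132, 216))
```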

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               16.60942048     18.24734733
Coefficient of variation (CV)    0.1391357328    0.1479933677
Kurtosis                         2.456430106     2.04840905
Mean                             119.3756639     123.2984127
Median Absolute Deviation (MAD)  10              10
Skewness                         0.9939181346    1.067692717
Sum                              494454          155356
Variance                         275.8728488     332.9656845
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
114 283
 
6.5%
116 236
 
5.4%
112 217
 
5.0%
118 213
 
4.9%
104 212
 
4.9%
124 202
 
4.6%
126 197
 
4.5%
120 197
 
4.5%
106 192
 
4.4%
122 177
 
4.1%
Other values (50) 2016
46.2%
(Missing) 223
 
5.1%
ValueCountFrequency (%)
114 82
 
6.1%
106 72
 
5.4%
116 72
 
5.4%
126 68
 
5.1%
120 59
 
4.4%
118 58
 
4.3%
108 58
 
4.3%
124 55
 
4.1%
110 54
 
4.0%
122 52
 
3.9%
Other values (46) 630
47.1%
(Missing) 79
 
5.9%
ValueCountFrequency (%)
76 1
 
< 0.1%
80 2
 
< 0.1%
82 7
0.2%
84 14
0.3%
86 12
0.3%
ValueCountFrequency (%)
80 3
0.2%
82 1
 
0.1%
84 2
0.1%
88 4
0.3%
90 3
0.2%
ValueCountFrequency (%)
80 3
0.1%
82 1
 
< 0.1%
84 2
< 0.1%
88 4
0.1%
90 3
0.1%
ValueCountFrequency (%)
76 1
 
0.1%
80 2
 
0.1%
82 7
0.5%
84 14
1.0%
86 12
0.9%

BPDia3
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         45              40
Distinct (%)                     1.1%            3.2%
Missing                          223             79
Missing (%)                      5.1%            5.9%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             70.48768711     68.32222222
Minimum                          0               0
Maximum                          114             116
Zeros                            17              14
Zeros (%)                        0.4%            1.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  52              46
Q1                               64              62
median                           70              68
Q3                               78              78
95-th percentile                 88              88
Maximum                          114             116
Range                            114             116
Interquartile range (IQR)        14              16

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               12.15125308     14.27921883
Coefficient of variation (CV)    0.1723883075    0.2089981614
Kurtosis                         4.392492712     4.770120947
Mean                             70.48768711     68.32222222
Median Absolute Deviation (MAD)  8               8
Skewness                         -0.7883435228   -1.087355986
Sum                              291960          86086
Variance                         147.6529515     203.8960904
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=45)
ValueCountFrequency (%)
68 324
 
7.4%
72 303
 
6.9%
78 300
 
6.9%
66 293
 
6.7%
76 291
 
6.7%
70 280
 
6.4%
74 278
 
6.4%
64 253
 
5.8%
62 214
 
4.9%
80 184
 
4.2%
Other values (35) 1422
32.6%
(Missing) 223
 
5.1%
ValueCountFrequency (%)
64 93
 
6.9%
68 90
 
6.7%
72 83
 
6.2%
70 82
 
6.1%
76 72
 
5.4%
66 71
 
5.3%
62 68
 
5.1%
74 67
 
5.0%
60 65
 
4.9%
78 64
 
4.8%
Other values (30) 505
37.7%
(Missing) 79
 
5.9%
ValueCountFrequency (%)
0 17
0.4%
28 1
 
< 0.1%
30 1
 
< 0.1%
32 5
 
0.1%
34 2
 
< 0.1%
ValueCountFrequency (%)
0 14
1.0%
32 2
 
0.1%
34 2
 
0.1%
36 7
0.5%
38 6
0.4%
ValueCountFrequency (%)
0 14
0.3%
32 2
 
< 0.1%
34 2
 
< 0.1%
36 7
0.2%
38 6
0.1%
ValueCountFrequency (%)
0 17
1.3%
28 1
 
0.1%
30 1
 
0.1%
32 5
 
0.4%
34 2
 
0.1%

Testosterone
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         1171            370
Distinct (%)                     54.3%           68.0%
Missing                          2208            795
Missing (%)                      50.6%           59.4%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             212.6698563     225.5515625
Minimum                          0.65            0.25
Maximum                          1795.6          1244.73
Zeros                            0               0
Zeros (%)                        0.0%            0.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0.65            0.25
5-th percentile                  9.068           7.9435
Q1                               19.73           20.62
median                           59.02           196.04
Q3                               386.91          367.815
95-th percentile                 636.6           600
Maximum                          1795.6          1244.73
Range                            1794.95         1244.48
Interquartile range (IQR)        367.18          347.195

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               231.7978659     220.4822143
Coefficient of variation (CV)    1.08994227      0.9775246593
Kurtosis                         0.977950311     0.4742514126
Mean                             212.6698563     225.5515625
Median Absolute Deviation (MAD)  51.48           175.42
Skewness                         1.042675194     0.8821588005
Sum                              458728.88       122700.05
Variance                         53730.25063     48612.40681
Monotonicity                     Not monotonic   Not monotonic
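Several of these rows are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of non-missing values. The sketch below checks them against the Testosterone figures. It assumes 4365 and 1339 total rows (from the report overview) and uses illustrative names throughout.

```python
import math

# Testosterone figures as reported: (std, mean, variance, total rows, missing).
reported = {
    "Dataset A": (231.7978659, 212.6698563, 53730.25063, 4365, 2208),
    "Dataset B": (220.4822143, 225.5515625, 48612.40681, 1339, 795),
}


def derived(std: float, mean: float, n_rows: int, n_missing: int) -> dict:
    """Recompute CV, variance and sum from std, mean and the non-missing count."""
    n_valid = n_rows - n_missing
    return {"cv": std / mean, "variance": std ** 2, "sum": mean * n_valid}


for name, (std, mean, variance, n_rows, n_missing) in reported.items():
    d = derived(std, mean, n_rows, n_missing)
    # The reported inputs are rounded, so compare with a relative tolerance.
    matches = math.isclose(d["variance"], variance, rel_tol=1e-6)
    print(f"{name}: CV={d['cv']:.6f}, sum={d['sum']:.2f}, variance matches: {matches}")
```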
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25.54 8
 
0.2%
23.02 8
 
0.2%
579.1 7
 
0.2%
246.82 7
 
0.2%
367.35 7
 
0.2%
18.25 7
 
0.2%
350 6
 
0.1%
15.96 6
 
0.1%
23.24 6
 
0.1%
15.22 6
 
0.1%
Other values (1161) 2089
47.9%
(Missing) 2208
50.6%
ValueCountFrequency (%)
174.1 6
 
0.4%
12.26 5
 
0.4%
4.55 5
 
0.4%
15.16 4
 
0.3%
17.3 4
 
0.3%
20.62 4
 
0.3%
683.12 4
 
0.3%
8.67 4
 
0.3%
311.09 4
 
0.3%
39.6 4
 
0.3%
Other values (360) 500
37.3%
(Missing) 795
59.4%
ValueCountFrequency (%)
0.65 2
< 0.1%
1.79 1
< 0.1%
2.35 1
< 0.1%
2.72 1
< 0.1%
3.39 1
< 0.1%
ValueCountFrequency (%)
0.25 1
0.1%
1.52 1
0.1%
1.98 1
0.1%
3.22 2
0.1%
3.32 1
0.1%
ValueCountFrequency (%)
0.25 1
< 0.1%
1.52 1
< 0.1%
1.98 1
< 0.1%
3.22 2
< 0.1%
3.32 1
< 0.1%
ValueCountFrequency (%)
0.65 2
0.1%
1.79 1
0.1%
2.35 1
0.1%
2.72 1
0.1%
3.39 1
0.1%

DirectChol
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         98              80
Distinct (%)                     2.4%            6.5%
Missing                          209             99
Missing (%)                      4.8%            7.4%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             1.414480269     1.274145161
Minimum                          0.52            0.39
Maximum                          3.83            4.03
Zeros                            0               0
Zeros (%)                        0.0%            0.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0.52            0.39
5-th percentile                  0.85            0.78
Q1                               1.11            1.01
median                           1.34            1.22
Q3                               1.66            1.47
95-th percentile                 2.22            1.97
Maximum                          3.83            4.03
Range                            3.31            3.64
Interquartile range (IQR)        0.55            0.46

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               0.4309245029    0.3811940099
Coefficient of variation (CV)    0.3046521837    0.2991762803
Kurtosis                         1.666444028     3.855635674
Mean                             1.414480269     1.274145161
Median Absolute Deviation (MAD)  0.28            0.24
Skewness                         0.988975258     1.206519196
Sum                              5878.58         1579.94
Variance                         0.1856959272    0.1453088732
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1.24 146
 
3.3%
1.19 124
 
2.8%
1.29 119
 
2.7%
1.11 117
 
2.7%
1.32 111
 
2.5%
1.06 109
 
2.5%
1.14 109
 
2.5%
1.37 108
 
2.5%
1.4 105
 
2.4%
1.5 104
 
2.4%
Other values (88) 3004
68.8%
(Missing) 209
 
4.8%
ValueCountFrequency (%)
0.98 53
 
4.0%
1.32 50
 
3.7%
1.11 44
 
3.3%
1.19 42
 
3.1%
0.96 40
 
3.0%
1.16 40
 
3.0%
1.06 40
 
3.0%
1.22 37
 
2.8%
1.29 36
 
2.7%
1.24 35
 
2.6%
Other values (70) 823
61.5%
(Missing) 99
 
7.4%
ValueCountFrequency (%)
0.52 2
< 0.1%
0.54 3
0.1%
0.57 3
0.1%
0.59 4
0.1%
0.62 2
< 0.1%
ValueCountFrequency (%)
0.39 3
0.2%
0.41 1
 
0.1%
0.47 1
 
0.1%
0.54 1
 
0.1%
0.57 4
0.3%
ValueCountFrequency (%)
0.39 3
0.1%
0.41 1
 
< 0.1%
0.47 1
 
< 0.1%
0.54 1
 
< 0.1%
0.57 4
0.1%
ValueCountFrequency (%)
0.52 2
0.1%
0.54 3
0.2%
0.57 3
0.2%
0.59 4
0.3%
0.62 2
0.1%

TotChol
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         219             187
Distinct (%)                     5.3%            15.1%
Missing                          209             99
Missing (%)                      4.8%            7.4%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             5.080510106     5.004508065
Minimum                          1.53            2.17
Maximum                          13.65           10.29
Zeros                            0               0
Zeros (%)                        0.0%            0.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          1.53            2.17
5-th percentile                  3.54            3.34
Q1                               4.32            4.29
median                           5.02            4.94
Q3                               5.69            5.645
95-th percentile                 6.83            6.78
Maximum                          13.65           10.29
Range                            12.12           8.12
Interquartile range (IQR)        1.37            1.355

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               1.040029635     1.057549858
Coefficient of variation (CV)    0.2047096872    0.2113194433
Kurtosis                         1.982036038     0.740846084
Mean                             5.080510106     5.004508065
Median Absolute Deviation (MAD)  0.7             0.685
Skewness                         0.74748916      0.415698943
Sum                              21114.6         6205.59
Variance                         1.081661641     1.118411703
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4.84 63
 
1.4%
4.78 62
 
1.4%
4.81 55
 
1.3%
5.02 54
 
1.2%
5.2 53
 
1.2%
5.15 51
 
1.2%
5.04 50
 
1.1%
4.24 49
 
1.1%
4.86 49
 
1.1%
5.43 48
 
1.1%
Other values (209) 3622
83.0%
(Missing) 209
 
4.8%
ValueCountFrequency (%)
4.29 25
 
1.9%
5.51 21
 
1.6%
3.88 19
 
1.4%
4.09 19
 
1.4%
4.78 18
 
1.3%
4.91 17
 
1.3%
4.19 17
 
1.3%
4.97 16
 
1.2%
4.5 16
 
1.2%
5.09 16
 
1.2%
Other values (177) 1056
78.9%
(Missing) 99
 
7.4%
ValueCountFrequency (%)
1.53 2
< 0.1%
2.38 2
< 0.1%
2.74 1
 
< 0.1%
2.77 1
 
< 0.1%
2.79 3
0.1%
ValueCountFrequency (%)
2.17 1
0.1%
2.35 1
0.1%
2.4 1
0.1%
2.43 1
0.1%
2.59 1
0.1%
ValueCountFrequency (%)
2.17 1
< 0.1%
2.35 1
< 0.1%
2.4 1
< 0.1%
2.43 1
< 0.1%
2.59 1
< 0.1%
ValueCountFrequency (%)
1.53 2
0.1%
2.38 2
0.1%
2.74 1
 
0.1%
2.77 1
 
0.1%
2.79 3
0.2%

UrineVol1
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         388             293
Distinct (%)                     9.0%            22.4%
Missing                          36              32
Missing (%)                      0.8%            2.4%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             122.4377454     116.5462892
Minimum                          0               0
Maximum                          488             410
Zeros                            5               1
Zeros (%)                        0.1%            0.1%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  18              23
Q1                               51              55
median                           97              93
Q3                               171             157
95-th percentile                 310.8           299
Maximum                          488             410
Range                            488             410
Interquartile range (IQR)        120             102

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               92.23179064     83.12131556
Coefficient of variation (CV)    0.7532954018    0.7132043081
Kurtosis                         0.5354138132    0.8691830965
Mean                             122.4377454     116.5462892
Median Absolute Deviation (MAD)  54              47
Skewness                         1.071453976     1.150515489
Sum                              530033          152326
Variance                         8506.703205     6909.153101
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
38 58
 
1.3%
18 44
 
1.0%
60 43
 
1.0%
56 41
 
0.9%
68 40
 
0.9%
36 39
 
0.9%
106 36
 
0.8%
28 36
 
0.8%
40 35
 
0.8%
84 34
 
0.8%
Other values (378) 3923
89.9%
(Missing) 36
 
0.8%
ValueCountFrequency (%)
62 17
 
1.3%
64 16
 
1.2%
36 16
 
1.2%
87 16
 
1.2%
50 14
 
1.0%
70 14
 
1.0%
38 13
 
1.0%
31 13
 
1.0%
69 13
 
1.0%
52 13
 
1.0%
Other values (283) 1162
86.8%
(Missing) 32
 
2.4%
ValueCountFrequency (%)
0 5
0.1%
1 6
0.1%
3 3
0.1%
4 3
0.1%
5 5
0.1%
ValueCountFrequency (%)
0 1
 
0.1%
1 3
0.2%
2 1
 
0.1%
4 3
0.2%
5 2
0.1%
ValueCountFrequency (%)
0 1
 
< 0.1%
1 3
0.1%
2 1
 
< 0.1%
4 3
0.1%
5 2
< 0.1%
ValueCountFrequency (%)
0 5
0.4%
1 6
0.4%
3 3
0.2%
4 3
0.2%
5 5
0.4%

UrineFlow1
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         1456            716
Distinct (%)                     35.3%           58.1%
Missing                          241             107
Missing (%)                      5.5%            8.0%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             1.074409796     0.9526160714
Minimum                          0               0
Maximum                          8.73            12.346
Zeros                            2               1
Zeros (%)                        < 0.1%          0.1%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  0.19            0.16955
Q1                               0.461           0.397
median                           0.8             0.6775
Q3                               1.35525         1.15825
95-th percentile                 2.9426          2.65625
Maximum                          8.73            12.346
Range                            8.73            12.346
Interquartile range (IQR)        0.89425         0.76125

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               0.9601449886    0.9654069107
Coefficient of variation (CV)    0.8936487659    1.013427066
Kurtosis                         9.643442489     32.90750452
Mean                             1.074409796     0.9526160714
Median Absolute Deviation (MAD)  0.405           0.3355
Skewness                         2.53989013      4.280652041
Sum                              4430.866        1173.623
Variance                         0.9218783991    0.9320105032
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 19
 
0.4%
0.37 16
 
0.4%
0.667 15
 
0.3%
0.646 14
 
0.3%
0.5 14
 
0.3%
0.811 14
 
0.3%
0.273 13
 
0.3%
1.214 13
 
0.3%
0.817 13
 
0.3%
0.686 12
 
0.3%
Other values (1446) 3981
91.2%
(Missing) 241
 
5.5%
ValueCountFrequency (%)
0.373 11
 
0.8%
0.5 9
 
0.7%
0.537 7
 
0.5%
0.803 6
 
0.4%
0.252 6
 
0.4%
0.297 6
 
0.4%
0.414 6
 
0.4%
0.287 6
 
0.4%
0.633 6
 
0.4%
0.89 6
 
0.4%
Other values (706) 1163
86.9%
(Missing) 107
 
8.0%
ValueCountFrequency (%)
0 2
< 0.1%
0.005 1
 
< 0.1%
0.006 1
 
< 0.1%
0.011 3
0.1%
0.014 1
 
< 0.1%
ValueCountFrequency (%)
0 1
0.1%
0.006 1
0.1%
0.011 1
0.1%
0.015 2
0.1%
0.032 2
0.1%
ValueCountFrequency (%)
0 1
< 0.1%
0.006 1
< 0.1%
0.011 1
< 0.1%
0.015 2
< 0.1%
0.032 2
< 0.1%
ValueCountFrequency (%)
0 2
0.1%
0.005 1
 
0.1%
0.006 1
 
0.1%
0.011 3
0.2%
0.014 1
 
0.1%

UrineVol2
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         227             83
Distinct (%)                     31.1%           56.5%
Missing                          3634            1192
Missing (%)                      83.3%           89.0%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             129.0930233     119.9319728
Minimum                          0               0
Maximum                          408             409
Zeros                            7               2
Zeros (%)                        0.2%            0.1%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  24              19.5
Q1                               56              52
median                           110             89
Q3                               187             151
95-th percentile                 310             352
Maximum                          408             409
Range                            408             409
Interquartile range (IQR)        131             99

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               90.98429546     100.2829326
Coefficient of variation (CV)    0.7047963799    0.8361651216
Kurtosis                         0.27110802      1.265476379
Mean                             129.0930233     119.9319728
Median Absolute Deviation (MAD)  58              47
Skewness                         0.9437307798    1.391308859
Sum                              94367           17630
Variance                         8278.14202      10056.66657
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
33 10
 
0.2%
82 10
 
0.2%
54 10
 
0.2%
110 9
 
0.2%
102 8
 
0.2%
64 8
 
0.2%
50 8
 
0.2%
118 7
 
0.2%
28 7
 
0.2%
53 7
 
0.2%
Other values (217) 647
 
14.8%
(Missing) 3634
83.3%
ValueCountFrequency (%)
68 8
 
0.6%
79 4
 
0.3%
103 3
 
0.2%
52 3
 
0.2%
409 3
 
0.2%
74 3
 
0.2%
30 3
 
0.2%
112 3
 
0.2%
108 3
 
0.2%
252 3
 
0.2%
Other values (73) 111
 
8.3%
(Missing) 1192
89.0%
ValueCountFrequency (%)
0 7
0.2%
1 3
0.1%
3 1
 
< 0.1%
4 1
 
< 0.1%
6 1
 
< 0.1%
ValueCountFrequency (%)
0 2
0.1%
8 2
0.1%
10 1
0.1%
12 1
0.1%
14 1
0.1%
ValueCountFrequency (%)
0 2
< 0.1%
8 2
< 0.1%
10 1
< 0.1%
12 1
< 0.1%
14 1
< 0.1%
ValueCountFrequency (%)
0 7
0.5%
1 3
0.2%
3 1
 
0.1%
4 1
 
0.1%
6 1
 
0.1%

UrineFlow2
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         387             110
Distinct (%)                     53.1%           74.8%
Missing                          3636            1192
Missing (%)                      83.3%           89.0%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             1.171175583     1.023816327
Minimum                          0               0
Maximum                          5.474           4
Zeros                            7               2
Zeros (%)                        0.2%            0.1%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  0.259           0.1799
Q1                               0.524           0.439
median                           0.803           0.691
Q3                               1.485           1.365
95-th percentile                 3.066           3.1966
Maximum                          5.474           4
Range                            5.474           4
Interquartile range (IQR)        0.961           0.926

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               0.9956887667    0.8757534136
Coefficient of variation (CV)    0.8501618213    0.8553813715
Kurtosis                         3.442929776     1.379013568
Mean                             1.171175583     1.023816327
Median Absolute Deviation (MAD)  0.38            0.333
Skewness                         1.790281445     1.427808858
Sum                              853.787         150.501
Variance                         0.9913961202    0.7669440414
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.45 8
 
0.2%
0.55 8
 
0.2%
0 7
 
0.2%
0.482 6
 
0.1%
0.57 6
 
0.1%
0.298 6
 
0.1%
2.654 5
 
0.1%
0.42 5
 
0.1%
0.979 5
 
0.1%
0.882 5
 
0.1%
Other values (377) 668
 
15.3%
(Missing) 3636
83.3%
ValueCountFrequency (%)
0.532 4
 
0.3%
1 3
 
0.2%
0.265 3
 
0.2%
3.22 3
 
0.2%
0.586 3
 
0.2%
0.898 2
 
0.1%
1.617 2
 
0.1%
2.135 2
 
0.1%
0.485 2
 
0.1%
0.447 2
 
0.1%
Other values (100) 121
 
9.0%
(Missing) 1192
89.0%
ValueCountFrequency (%)
0 7
0.2%
0.018 1
 
< 0.1%
0.02 1
 
< 0.1%
0.022 2
 
< 0.1%
0.07 1
 
< 0.1%
ValueCountFrequency (%)
0 2
0.1%
0.057 2
0.1%
0.115 1
0.1%
0.139 1
0.1%
0.147 1
0.1%
ValueCountFrequency (%)
0 2
< 0.1%
0.057 2
< 0.1%
0.115 1
< 0.1%
0.139 1
< 0.1%
0.147 1
< 0.1%
ValueCountFrequency (%)
0 7
0.5%
0.018 1
 
0.1%
0.02 1
 
0.1%
0.022 2
 
0.1%
0.07 1
 
0.1%

Diabetes
Text

Statistic                        Dataset A       Dataset B
Distinct                         2               2
Distinct (%)                     < 0.1%          0.1%
Missing                          0               2
Missing (%)                      0.0%            0.1%
Memory size                      197.2 KiB       53.2 KiB

Length

Statistic                        Dataset A       Dataset B
Max length                       3               3
Median length                    2               2
Mean length                      2.085223368     2.153328347
Min length                       2               2
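The mean length and the total character count can both be recovered from the value counts for this variable ("No" and "Yes" with the frequencies listed in this section). A minimal sketch, with an illustrative function name:

```python
def length_summary(value_counts: dict[str, int]) -> tuple[int, float]:
    """Return (total characters, mean length) for a {value: count} mapping."""
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    n_values = sum(value_counts.values())
    return total_chars, total_chars / n_values


# Diabetes value counts as listed in this section's frequency tables.
for name, counts in [
    ("Dataset A", {"No": 3993, "Yes": 372}),
    ("Dataset B", {"No": 1132, "Yes": 205}),
]:
    total, mean_len = length_summary(counts)
    print(f"{name}: total characters {total}, mean length {mean_len:.9f}")
```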

Characters and Unicode

Statistic                        Dataset A       Dataset B
Total characters                 9102            2879
Distinct characters              5               5
Distinct categories              1               1
Distinct scripts                 1               1
Distinct blocks                  1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
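As an illustration of such a property, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is an assumption about tooling, not what the report used: the report lists categories, scripts and blocks as "(unknown)", and the standard module does not expose script or block properties. The function name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# "Lu" is an uppercase letter, "Ll" a lowercase letter.
print(category_counts(["No", "Yes", "No"]))
```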

Unique

Statistic                        Dataset A       Dataset B
Unique                           0               0
Unique (%)                       0.0%            0.0%

Sample

Row                              Dataset A       Dataset B
1st row                          No              No
2nd row                          Yes             No
3rd row                          No              No
4th row                          No              No
5th row                          No              Yes
ValueCountFrequency (%)
no 3993
91.5%
yes 372
 
8.5%
ValueCountFrequency (%)
no 1132
84.7%
yes 205
 
15.3%

Most occurring characters

ValueCountFrequency (%)
N 3993
43.9%
o 3993
43.9%
Y 372
 
4.1%
e 372
 
4.1%
s 372
 
4.1%
ValueCountFrequency (%)
N 1132
39.3%
o 1132
39.3%
Y 205
 
7.1%
e 205
 
7.1%
s 205
 
7.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 9102
100.0%
ValueCountFrequency (%)
(unknown) 2879
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 3993
43.9%
o 3993
43.9%
Y 372
 
4.1%
e 372
 
4.1%
s 372
 
4.1%
ValueCountFrequency (%)
N 1132
39.3%
o 1132
39.3%
Y 205
 
7.1%
e 205
 
7.1%
s 205
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 9102
100.0%
ValueCountFrequency (%)
(unknown) 2879
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 3993
43.9%
o 3993
43.9%
Y 372
 
4.1%
e 372
 
4.1%
s 372
 
4.1%
ValueCountFrequency (%)
N 1132
39.3%
o 1132
39.3%
Y 205
 
7.1%
e 205
 
7.1%
s 205
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 9102
100.0%
ValueCountFrequency (%)
(unknown) 2879
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 3993
43.9%
o 3993
43.9%
Y 372
 
4.1%
e 372
 
4.1%
s 372
 
4.1%
ValueCountFrequency (%)
N 1132
39.3%
o 1132
39.3%
Y 205
 
7.1%
e 205
 
7.1%
s 205
 
7.1%

DiabetesAge
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         59              50
Distinct (%)                     20.2%           28.4%
Missing                          4073            1163
Missing (%)                      93.3%           86.9%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             49.70890411     49.88636364
Minimum                          7               6
Maximum                          77              80
Zeros                            0               0
Zeros (%)                        0.0%            0.0%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          7               6
5-th percentile                  20.55           30
Q1                               44              40
median                           50              50
Q3                               59              60
95-th percentile                 72              75
Maximum                          77              80
Range                            70              74
Interquartile range (IQR)        15              20

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               14.15673881     14.051279
Coefficient of variation (CV)    0.2847928166    0.2816657294
Kurtosis                         0.6018898092    0.4598683502
Mean                             49.70890411     49.88636364
Median Absolute Deviation (MAD)  8               10
Skewness                         -0.6644629053   -0.1727966237
Sum                              14515           8780
Variance                         200.4132538     197.4384416
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
50 23
 
0.5%
49 19
 
0.4%
55 19
 
0.4%
60 17
 
0.4%
48 11
 
0.3%
45 10
 
0.2%
51 9
 
0.2%
40 9
 
0.2%
46 9
 
0.2%
35 8
 
0.2%
Other values (49) 158
 
3.6%
(Missing) 4073
93.3%
ValueCountFrequency (%)
55 12
 
0.9%
48 11
 
0.8%
40 11
 
0.8%
50 9
 
0.7%
45 9
 
0.7%
35 7
 
0.5%
62 7
 
0.5%
60 7
 
0.5%
58 6
 
0.4%
52 6
 
0.4%
Other values (40) 91
 
6.8%
(Missing) 1163
86.9%
ValueCountFrequency (%)
7 1
< 0.1%
8 1
< 0.1%
11 2
< 0.1%
12 1
< 0.1%
14 1
< 0.1%
ValueCountFrequency (%)
6 1
0.1%
7 1
0.1%
16 1
0.1%
18 2
0.1%
22 1
0.1%
ValueCountFrequency (%)
6 1
< 0.1%
7 1
< 0.1%
16 1
< 0.1%
18 2
< 0.1%
22 1
< 0.1%
ValueCountFrequency (%)
7 1
0.1%
8 1
0.1%
11 2
0.1%
12 1
0.1%
14 1
0.1%

HealthGen
Text

Statistic                        Dataset A       Dataset B
Distinct                         5               5
Distinct (%)                     0.1%            0.4%
Missing                          420             190
Missing (%)                      9.6%            14.2%
Memory size                      197.2 KiB       53.2 KiB

Length

Statistic                        Dataset A       Dataset B
Max length                       9               9
Median length                    5               4
Mean length                      5.087198986     4.476936466
Min length                       4               4

Characters and Unicode

Statistic                        Dataset A       Dataset B
Total characters                 20069           5144
Distinct characters              17              17
Distinct categories              1               1
Distinct scripts                 1               1
Distinct blocks                  1               1

Unique

Statistic                        Dataset A       Dataset B
Unique                           0               0
Unique (%)                       0.0%            0.0%

Sample

Row                              Dataset A       Dataset B
1st row                          Vgood           Poor
2nd row                          Good            Good
3rd row                          Good            Good
4th row                          Vgood           Fair
5th row                          Fair            Good
ValueCountFrequency (%)
vgood 1509
38.3%
good 1473
37.3%
excellent 556
 
14.1%
fair 334
 
8.5%
poor 73
 
1.9%
ValueCountFrequency (%)
good 466
40.6%
fair 340
29.6%
vgood 203
17.7%
poor 71
 
6.2%
excellent 69
 
6.0%

Most occurring characters

ValueCountFrequency (%)
o 6110
30.4%
d 2982
14.9%
V 1509
 
7.5%
g 1509
 
7.5%
G 1473
 
7.3%
l 1112
 
5.5%
e 1112
 
5.5%
c 556
 
2.8%
x 556
 
2.8%
E 556
 
2.8%
Other values (7) 2594
12.9%
ValueCountFrequency (%)
o 1480
28.8%
d 669
13.0%
G 466
 
9.1%
r 411
 
8.0%
F 340
 
6.6%
a 340
 
6.6%
i 340
 
6.6%
g 203
 
3.9%
V 203
 
3.9%
e 138
 
2.7%
Other values (7) 554
 
10.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 20069
100.0%
ValueCountFrequency (%)
(unknown) 5144
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
o 6110
30.4%
d 2982
14.9%
V 1509
 
7.5%
g 1509
 
7.5%
G 1473
 
7.3%
l 1112
 
5.5%
e 1112
 
5.5%
c 556
 
2.8%
x 556
 
2.8%
E 556
 
2.8%
Other values (7) 2594
12.9%
ValueCountFrequency (%)
o 1480
28.8%
d 669
13.0%
G 466
 
9.1%
r 411
 
8.0%
F 340
 
6.6%
a 340
 
6.6%
i 340
 
6.6%
g 203
 
3.9%
V 203
 
3.9%
e 138
 
2.7%
Other values (7) 554
 
10.8%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 20069
100.0%
ValueCountFrequency (%)
(unknown) 5144
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
o 6110
30.4%
d 2982
14.9%
V 1509
 
7.5%
g 1509
 
7.5%
G 1473
 
7.3%
l 1112
 
5.5%
e 1112
 
5.5%
c 556
 
2.8%
x 556
 
2.8%
E 556
 
2.8%
Other values (7) 2594
12.9%
ValueCountFrequency (%)
o 1480
28.8%
d 669
13.0%
G 466
 
9.1%
r 411
 
8.0%
F 340
 
6.6%
a 340
 
6.6%
i 340
 
6.6%
g 203
 
3.9%
V 203
 
3.9%
e 138
 
2.7%
Other values (7) 554
 
10.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 20069
100.0%
ValueCountFrequency (%)
(unknown) 5144
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
o 6110
30.4%
d 2982
14.9%
V 1509
 
7.5%
g 1509
 
7.5%
G 1473
 
7.3%
l 1112
 
5.5%
e 1112
 
5.5%
c 556
 
2.8%
x 556
 
2.8%
E 556
 
2.8%
Other values (7) 2594
12.9%
ValueCountFrequency (%)
o 1480
28.8%
d 669
13.0%
G 466
 
9.1%
r 411
 
8.0%
F 340
 
6.6%
a 340
 
6.6%
i 340
 
6.6%
g 203
 
3.9%
V 203
 
3.9%
e 138
 
2.7%
Other values (7) 554
 
10.8%

DaysPhysHlthBad
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         27              28
Distinct (%)                     0.7%            2.4%
Missing                          421             194
Missing (%)                      9.6%            14.5%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             3.09178499      5.180786026
Minimum                          0               0
Maximum                          30              30
Zeros                            2614            704
Zeros (%)                        59.9%           52.6%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  0               0
Q1                               0               0
median                           0               0
Q3                               2               5
95-th percentile                 21              30
Maximum                          30              30
Range                            30              30
Interquartile range (IQR)        2               5

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               7.101895491     9.495470824
Coefficient of variation (CV)    2.297021143     1.832824358
Kurtosis                         7.463823815     1.911725581
Mean                             3.09178499      5.180786026
Median Absolute Deviation (MAD)  0               0
Skewness                         2.868045904     1.834858609
Sum                              12194           5932
Variance                         50.43691956     90.16396616
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
0 2614
59.9%
2 237
 
5.4%
30 177
 
4.1%
3 164
 
3.8%
1 136
 
3.1%
5 117
 
2.7%
4 111
 
2.5%
7 99
 
2.3%
10 58
 
1.3%
14 51
 
1.2%
Other values (17) 180
 
4.1%
(Missing) 421
 
9.6%
ValueCountFrequency (%)
0 704
52.6%
30 109
 
8.1%
2 50
 
3.7%
5 40
 
3.0%
3 33
 
2.5%
1 26
 
1.9%
7 25
 
1.9%
15 25
 
1.9%
4 24
 
1.8%
10 23
 
1.7%
Other values (18) 86
 
6.4%
(Missing) 194
 
14.5%
ValueCountFrequency (%)
0 2614
59.9%
1 136
 
3.1%
2 237
 
5.4%
3 164
 
3.8%
4 111
 
2.5%
ValueCountFrequency (%)
0 704
52.6%
1 26
 
1.9%
2 50
 
3.7%
3 33
 
2.5%
4 24
 
1.8%
ValueCountFrequency (%)
0 704
16.1%
1 26
 
0.6%
2 50
 
1.1%
3 33
 
0.8%
4 24
 
0.5%
ValueCountFrequency (%)
0 2614
195.2%
1 136
 
10.2%
2 237
 
17.7%
3 164
 
12.2%
4 111
 
8.3%

DaysMentHlthBad
Real number (ℝ)

Statistic                        Dataset A       Dataset B
Distinct                         27              25
Distinct (%)                     0.7%            2.2%
Missing                          421             192
Missing (%)                      9.6%            14.3%
Infinite                         0               0
Infinite (%)                     0.0%            0.0%
Mean                             3.914300203     5.306887533
Minimum                          0               0
Maximum                          30              30
Zeros                            2269            663
Zeros (%)                        52.0%           49.5%
Negative                         0               0
Negative (%)                     0.0%            0.0%
Memory size                      197.2 KiB       53.2 KiB

Quantile statistics

Statistic                        Dataset A       Dataset B
Minimum                          0               0
5-th percentile                  0               0
Q1                               0               0
median                           0               0
Q3                               4               5
95-th percentile                 30              30
Maximum                          30              30
Range                            30              30
Interquartile range (IQR)        4               5

Descriptive statistics

Statistic                        Dataset A       Dataset B
Standard deviation               7.699284635     9.44655101
Coefficient of variation (CV)    1.966963247     1.780054873
Kurtosis                         4.92835645      1.91590455
Mean                             3.914300203     5.306887533
Median Absolute Deviation (MAD)  0               0
Skewness                         2.415973379     1.823220326
Sum                              15438           6087
Variance                         59.27898389     89.23732599
Monotonicity                     Not monotonic   Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
0 2269
52.0%
2 276
 
6.3%
30 205
 
4.7%
3 200
 
4.6%
5 176
 
4.0%
1 172
 
3.9%
4 146
 
3.3%
10 98
 
2.2%
7 82
 
1.9%
15 80
 
1.8%
Other values (17) 240
 
5.5%
(Missing) 421
 
9.6%
ValueCountFrequency (%)
0 663
49.5%
30 108
 
8.1%
2 56
 
4.2%
5 49
 
3.7%
3 41
 
3.1%
1 40
 
3.0%
15 32
 
2.4%
4 26
 
1.9%
10 25
 
1.9%
14 24
 
1.8%
Other values (15) 83
 
6.2%
(Missing) 192
 
14.3%
ValueCountFrequency (%)
0 2269
52.0%
1 172
 
3.9%
2 276
 
6.3%
3 200
 
4.6%
4 146
 
3.3%
ValueCountFrequency (%)
0 663
49.5%
1 40
 
3.0%
2 56
 
4.2%
3 41
 
3.1%
4 26
 
1.9%
ValueCountFrequency (%)
0 663
15.2%
1 40
 
0.9%
2 56
 
1.3%
3 41
 
0.9%
4 26
 
0.6%
ValueCountFrequency (%)
0 2269
169.5%
1 172
 
12.8%
2 276
 
20.6%
3 200
 
14.9%
4 146
 
10.9%

LittleInterest
Text

Statistic                        Dataset A       Dataset B
Distinct                         2               2
Distinct (%)                     0.2%            0.6%
Missing                          3541            1010
Missing (%)                      81.1%           75.4%
Memory size                      197.2 KiB       53.2 KiB

Length

Statistic                        Dataset A       Dataset B
Max length                       7               7
Median length                    7               7
Mean length                      6.27184466      5.76899696
Min length                       4               4

Characters and Unicode

Statistic                        Dataset A       Dataset B
Total characters                 5168            1898
Distinct characters              10              10
Distinct categories              1               1
Distinct scripts                 1               1
Distinct blocks                  1               1

Unique

Statistic                        Dataset A       Dataset B
Unique                           0               0
Unique (%)                       0.0%            0.0%

Sample

Row                              Dataset A       Dataset B
1st row                          Most            Several
2nd row                          Several         Several
3rd row                          Several         Several
4th row                          Several         Most
5th row                          Most            Several
ValueCountFrequency (%)
several 624
75.7%
most 200
 
24.3%
ValueCountFrequency (%)
several 194
59.0%
most 135
41.0%

Most occurring characters

ValueCountFrequency (%)
e 1248
24.1%
S 624
12.1%
v 624
12.1%
r 624
12.1%
a 624
12.1%
l 624
12.1%
M 200
 
3.9%
o 200
 
3.9%
s 200
 
3.9%
t 200
 
3.9%
ValueCountFrequency (%)
e 388
20.4%
S 194
10.2%
v 194
10.2%
r 194
10.2%
a 194
10.2%
l 194
10.2%
M 135
 
7.1%
o 135
 
7.1%
s 135
 
7.1%
t 135
 
7.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 5168
100.0%
ValueCountFrequency (%)
(unknown) 1898
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 1248
24.1%
S 624
12.1%
v 624
12.1%
r 624
12.1%
a 624
12.1%
l 624
12.1%
M 200
 
3.9%
o 200
 
3.9%
s 200
 
3.9%
t 200
 
3.9%
ValueCountFrequency (%)
e 388
20.4%
S 194
10.2%
v 194
10.2%
r 194
10.2%
a 194
10.2%
l 194
10.2%
M 135
 
7.1%
o 135
 
7.1%
s 135
 
7.1%
t 135
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 5168
100.0%
ValueCountFrequency (%)
(unknown) 1898
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 1248
24.1%
S 624
12.1%
v 624
12.1%
r 624
12.1%
a 624
12.1%
l 624
12.1%
M 200
 
3.9%
o 200
 
3.9%
s 200
 
3.9%
t 200
 
3.9%
ValueCountFrequency (%)
e 388
20.4%
S 194
10.2%
v 194
10.2%
r 194
10.2%
a 194
10.2%
l 194
10.2%
M 135
 
7.1%
o 135
 
7.1%
s 135
 
7.1%
t 135
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 5168
100.0%
ValueCountFrequency (%)
(unknown) 1898
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 1248
24.1%
S 624
12.1%
v 624
12.1%
r 624
12.1%
a 624
12.1%
l 624
12.1%
M 200
 
3.9%
o 200
 
3.9%
s 200
 
3.9%
t 200
 
3.9%
ValueCountFrequency (%)
e 388
20.4%
S 194
10.2%
v 194
10.2%
r 194
10.2%
a 194
10.2%
l 194
10.2%
M 135
 
7.1%
o 135
 
7.1%
s 135
 
7.1%
t 135
 
7.1%

Depressed
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          0.3%          0.6%
Missing               3637          1013
Missing (%)           83.3%         75.7%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            7             7
Median length         7             7
Mean length           6.282967033   5.619631902
Min length            4             4

Characters and Unicode

                      Dataset A     Dataset B
Total characters      4574          1832
Distinct characters   10            10
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               Most          Several
2nd row               Several       Most
3rd row               Most          Most
4th row               Several       Several
5th row               Several       Most

Dataset A
Value       Count   Frequency (%)
several     554     76.1%
most        174     23.9%

Dataset B
Value       Count   Frequency (%)
several     176     54.0%
most        150     46.0%

Most occurring characters

ValueCountFrequency (%)
e 1108
24.2%
S 554
12.1%
v 554
12.1%
r 554
12.1%
a 554
12.1%
l 554
12.1%
M 174
 
3.8%
o 174
 
3.8%
s 174
 
3.8%
t 174
 
3.8%
ValueCountFrequency (%)
e 352
19.2%
S 176
9.6%
v 176
9.6%
r 176
9.6%
a 176
9.6%
l 176
9.6%
M 150
8.2%
o 150
8.2%
s 150
8.2%
t 150
8.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 4574
100.0%
ValueCountFrequency (%)
(unknown) 1832
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 1108
24.2%
S 554
12.1%
v 554
12.1%
r 554
12.1%
a 554
12.1%
l 554
12.1%
M 174
 
3.8%
o 174
 
3.8%
s 174
 
3.8%
t 174
 
3.8%
ValueCountFrequency (%)
e 352
19.2%
S 176
9.6%
v 176
9.6%
r 176
9.6%
a 176
9.6%
l 176
9.6%
M 150
8.2%
o 150
8.2%
s 150
8.2%
t 150
8.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 4574
100.0%
ValueCountFrequency (%)
(unknown) 1832
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 1108
24.2%
S 554
12.1%
v 554
12.1%
r 554
12.1%
a 554
12.1%
l 554
12.1%
M 174
 
3.8%
o 174
 
3.8%
s 174
 
3.8%
t 174
 
3.8%
ValueCountFrequency (%)
e 352
19.2%
S 176
9.6%
v 176
9.6%
r 176
9.6%
a 176
9.6%
l 176
9.6%
M 150
8.2%
o 150
8.2%
s 150
8.2%
t 150
8.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 4574
100.0%
ValueCountFrequency (%)
(unknown) 1832
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 1108
24.2%
S 554
12.1%
v 554
12.1%
r 554
12.1%
a 554
12.1%
l 554
12.1%
M 174
 
3.8%
o 174
 
3.8%
s 174
 
3.8%
t 174
 
3.8%
ValueCountFrequency (%)
e 352
19.2%
S 176
9.6%
v 176
9.6%
r 176
9.6%
a 176
9.6%
l 176
9.6%
M 150
8.2%
o 150
8.2%
s 150
8.2%
t 150
8.2%

nPregnancies
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          13             12
Distinct (%)                      0.8%           2.7%
Missing                           2812           887
Missing (%)                       64.4%          66.2%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              2.850611719    3.60619469
Minimum                           1              1
Maximum                           32             12
Zeros                             0              0
Zeros (%)                         0.0%           0.0%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           1              1
5-th percentile                   1              1
Q1                                2              2
median                            3              3
Q3                                4              5
95-th percentile                  6              7
Maximum                           32             12
Range                             31             11
Interquartile range (IQR)         2              3

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                1.738241409    1.995506401
Coefficient of variation (CV)     0.6097783844   0.5533551493
Kurtosis                          52.46199122    2.174666885
Mean                              2.850611719    3.60619469
Median Absolute Deviation (MAD)   1              1
Skewness                          4.029718776    1.290770789
Sum                               4427           1630
Variance                          3.021483195    3.982045798
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=13)
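The caption above reports 13 fixed-size bins. A minimal sketch of what equal-width binning means, assuming the bins span Dataset A's reported minimum (1) and maximum (32); the helper names `fixed_bin_edges` and `bin_index` are hypothetical, and the report's own plotting code may choose its range differently.

```python
def fixed_bin_edges(lo, hi, bins):
    """Return bins + 1 equally spaced edges covering [lo, hi]."""
    width = (hi - lo) / bins
    # Append `hi` explicitly so the last edge is exact despite float rounding.
    return [lo + i * width for i in range(bins)] + [hi]


def bin_index(x, lo, hi, bins):
    """Index of the bin containing x; the last bin is closed on the right."""
    if x == hi:
        return bins - 1
    return int((x - lo) * bins / (hi - lo))


# Assumed range: Dataset A minimum and maximum of nPregnancies as reported above.
edges = fixed_bin_edges(1, 32, 13)
print(len(edges), round(edges[1] - edges[0], 4))
print([bin_index(v, 1, 32, 13) for v in (1, 2, 3, 4, 32)])
```

Under this assumption each bin is about 2.38 wide, so the values 1, 2 and 3 share the first bin while the single maximum of 32 sits alone in the last one.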
ValueCountFrequency (%)
2 466
 
10.7%
3 365
 
8.4%
1 298
 
6.8%
4 210
 
4.8%
5 127
 
2.9%
6 48
 
1.1%
7 21
 
0.5%
8 8
 
0.2%
11 4
 
0.1%
10 3
 
0.1%
Other values (3) 3
 
0.1%
(Missing) 2812
64.4%
ValueCountFrequency (%)
3 113
 
8.4%
2 103
 
7.7%
4 78
 
5.8%
5 44
 
3.3%
1 43
 
3.2%
6 38
 
2.8%
7 13
 
1.0%
9 8
 
0.6%
10 6
 
0.4%
8 3
 
0.2%
Other values (2) 3
 
0.2%
(Missing) 887
66.2%
ValueCountFrequency (%)
1 298
6.8%
2 466
10.7%
3 365
8.4%
4 210
4.8%
5 127
 
2.9%
ValueCountFrequency (%)
1 43
 
3.2%
2 103
7.7%
3 113
8.4%
4 78
5.8%
5 44
 
3.3%
ValueCountFrequency (%)
1 43
 
1.0%
2 103
2.4%
3 113
2.6%
4 78
1.8%
5 44
 
1.0%
ValueCountFrequency (%)
1 298
22.3%
2 466
34.8%
3 365
27.3%
4 210
15.7%
5 127
 
9.5%

nBabies
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          10             12
Distinct (%)                      0.7%           2.8%
Missing                           2945           909
Missing (%)                       67.5%          67.9%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              2.231690141    3.037209302
Minimum                           0              0
Maximum                           11             12
Zeros                             4              3
Zeros (%)                         0.1%           0.2%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           0              0
5-th percentile                   1              1
Q1                                2              2
median                            2              3
Q3                                3              4
95-th percentile                  4              6
Maximum                           11             12
Range                             11             12
Interquartile range (IQR)         1              2

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                1.09607987     1.716099993
Coefficient of variation (CV)     0.4911433938   0.5650252658
Kurtosis                          4.847687453    3.242262259
Mean                              2.231690141    3.037209302
Median Absolute Deviation (MAD)   1              1
Skewness                          1.467574625    1.435070863
Sum                               3169           1306
Variance                          1.201391081    2.944999187
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
2 616
 
14.1%
1 346
 
7.9%
3 309
 
7.1%
4 93
 
2.1%
5 32
 
0.7%
6 13
 
0.3%
7 5
 
0.1%
0 4
 
0.1%
11 1
 
< 0.1%
8 1
 
< 0.1%
(Missing) 2945
67.5%
ValueCountFrequency (%)
2 131
 
9.8%
3 112
 
8.4%
1 58
 
4.3%
4 57
 
4.3%
5 30
 
2.2%
6 22
 
1.6%
7 6
 
0.4%
8 6
 
0.4%
9 3
 
0.2%
0 3
 
0.2%
Other values (2) 2
 
0.1%
(Missing) 909
67.9%
ValueCountFrequency (%)
0 4
 
0.1%
1 346
7.9%
2 616
14.1%
3 309
7.1%
4 93
 
2.1%
ValueCountFrequency (%)
0 3
 
0.2%
1 58
4.3%
2 131
9.8%
3 112
8.4%
4 57
4.3%
ValueCountFrequency (%)
0 3
 
0.1%
1 58
1.3%
2 131
3.0%
3 112
2.6%
4 57
1.3%
ValueCountFrequency (%)
0 4
 
0.3%
1 346
25.8%
2 616
46.0%
3 309
23.1%
4 93
 
6.9%

Age1stBaby
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          25             20
Distinct (%)                      2.3%           5.4%
Missing                           3295           972
Missing (%)                       75.5%          72.6%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              24.24579439    20.0626703
Minimum                           14             14
Maximum                           39             37
Zeros                             0              0
Zeros (%)                         0.0%           0.0%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           14             14
5-th percentile                   17             15
Q1                                21             17
median                            24             19
Q3                                28             22
95-th percentile                  33             28
Maximum                           39             37
Range                             25             23
Interquartile range (IQR)         7              5

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                4.868558187    3.789386452
Coefficient of variation (CV)     0.2008001102   0.1888774722
Kurtosis                          -0.432024269   2.036736448
Mean                              24.24579439    20.0626703
Median Absolute Deviation (MAD)   4              2
Skewness                          0.3118721804   1.196598794
Sum                               25943          7363
Variance                          23.70285882    14.35944968
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=25)
ValueCountFrequency (%)
23 102
 
2.3%
24 89
 
2.0%
21 87
 
2.0%
26 70
 
1.6%
20 69
 
1.6%
25 67
 
1.5%
19 61
 
1.4%
29 60
 
1.4%
22 59
 
1.4%
28 57
 
1.3%
Other values (15) 349
 
8.0%
(Missing) 3295
75.5%
ValueCountFrequency (%)
17 53
 
4.0%
18 45
 
3.4%
22 43
 
3.2%
19 43
 
3.2%
20 32
 
2.4%
21 28
 
2.1%
16 27
 
2.0%
23 21
 
1.6%
15 17
 
1.3%
24 15
 
1.1%
Other values (10) 43
 
3.2%
(Missing) 972
72.6%
ValueCountFrequency (%)
14 3
 
0.1%
15 14
 
0.3%
16 31
0.7%
17 28
0.6%
18 53
1.2%
ValueCountFrequency (%)
14 6
 
0.4%
15 17
 
1.3%
16 27
2.0%
17 53
4.0%
18 45
3.4%
ValueCountFrequency (%)
14 6
 
0.1%
15 17
 
0.4%
16 27
0.6%
17 53
1.2%
18 45
1.0%
ValueCountFrequency (%)
14 3
 
0.2%
15 14
 
1.0%
16 31
2.3%
17 28
2.1%
18 53
4.0%

SleepHrsNight
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          11             11
Distinct (%)                      0.3%           0.8%
Missing                           11             4
Missing (%)                       0.3%           0.3%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              6.950390446    6.81423221
Minimum                           2              2
Maximum                           12             12
Zeros                             0              0
Zeros (%)                         0.0%           0.0%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           2              2
5-th percentile                   5              4
Q1                                6              6
median                            7              7
Q3                                8              8
95-th percentile                  9              9
Maximum                           12             12
Range                             10             10
Interquartile range (IQR)         2              2

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                1.248871242    1.506091974
Coefficient of variation (CV)     0.1796836094   0.221021522
Kurtosis                          1.041102367    0.360634978
Mean                              6.950390446    6.81423221
Median Absolute Deviation (MAD)   1              1
Skewness                          -0.2028340671  -0.2279047704
Sum                               30262          9097
Variance                          1.559679379    2.268313034
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
7 1401
32.1%
8 1221
28.0%
6 982
22.5%
5 309
 
7.1%
9 218
 
5.0%
4 129
 
3.0%
10 47
 
1.1%
3 24
 
0.5%
12 13
 
0.3%
11 6
 
0.1%
(Missing) 11
 
0.3%
ValueCountFrequency (%)
8 378
28.2%
6 328
24.5%
7 300
22.4%
5 119
 
8.9%
4 74
 
5.5%
9 63
 
4.7%
10 37
 
2.8%
3 22
 
1.6%
11 7
 
0.5%
2 5
 
0.4%
(Missing) 4
 
0.3%
ValueCountFrequency (%)
2 4
 
0.1%
3 24
 
0.5%
4 129
 
3.0%
5 309
 
7.1%
6 982
22.5%
ValueCountFrequency (%)
2 5
 
0.4%
3 22
 
1.6%
4 74
 
5.5%
5 119
 
8.9%
6 328
24.5%
ValueCountFrequency (%)
2 5
 
0.1%
3 22
 
0.5%
4 74
 
1.7%
5 119
 
2.7%
6 328
7.5%
ValueCountFrequency (%)
2 4
 
0.3%
3 24
 
1.8%
4 129
 
9.6%
5 309
 
23.1%
6 982
73.3%

SleepTrouble
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          < 0.1%        0.1%
Missing               0             0
Missing (%)           0.0%          0.0%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            3             3
Median length         2             2
Mean length           2.264146621   2.249439881
Min length            2             2

Characters and Unicode

                      Dataset A     Dataset B
Total characters      9883          3012
Distinct characters   5             5
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               No            No
2nd row               Yes           Yes
3rd row               No            No
4th row               No            No
5th row               No            No

Dataset A
Value       Count   Frequency (%)
no          3212    73.6%
yes         1153    26.4%

Dataset B
Value       Count   Frequency (%)
no          1005    75.1%
yes         334     24.9%

Most occurring characters

ValueCountFrequency (%)
N 3212
32.5%
o 3212
32.5%
Y 1153
 
11.7%
e 1153
 
11.7%
s 1153
 
11.7%
ValueCountFrequency (%)
N 1005
33.4%
o 1005
33.4%
Y 334
 
11.1%
e 334
 
11.1%
s 334
 
11.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 9883
100.0%
ValueCountFrequency (%)
(unknown) 3012
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 3212
32.5%
o 3212
32.5%
Y 1153
 
11.7%
e 1153
 
11.7%
s 1153
 
11.7%
ValueCountFrequency (%)
N 1005
33.4%
o 1005
33.4%
Y 334
 
11.1%
e 334
 
11.1%
s 334
 
11.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 9883
100.0%
ValueCountFrequency (%)
(unknown) 3012
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 3212
32.5%
o 3212
32.5%
Y 1153
 
11.7%
e 1153
 
11.7%
s 1153
 
11.7%
ValueCountFrequency (%)
N 1005
33.4%
o 1005
33.4%
Y 334
 
11.1%
e 334
 
11.1%
s 334
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 9883
100.0%
ValueCountFrequency (%)
(unknown) 3012
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 3212
32.5%
o 3212
32.5%
Y 1153
 
11.7%
e 1153
 
11.7%
s 1153
 
11.7%
ValueCountFrequency (%)
N 1005
33.4%
o 1005
33.4%
Y 334
 
11.1%
e 334
 
11.1%
s 334
 
11.1%

PhysActive
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          < 0.1%        0.1%
Missing               0             0
Missing (%)           0.0%          0.0%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            3             3
Median length         3             2
Mean length           2.632989691   2.291262136
Min length            2             2

Characters and Unicode

                      Dataset A     Dataset B
Total characters      11493         3068
Distinct characters   5             5
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               Yes           No
2nd row               No            Yes
3rd row               No            No
4th row               No            No
5th row               Yes           No

Dataset A
Value       Count   Frequency (%)
yes         2763    63.3%
no          1602    36.7%

Dataset B
Value       Count   Frequency (%)
no          949     70.9%
yes         390     29.1%

Most occurring characters

ValueCountFrequency (%)
Y 2763
24.0%
e 2763
24.0%
s 2763
24.0%
N 1602
13.9%
o 1602
13.9%
ValueCountFrequency (%)
N 949
30.9%
o 949
30.9%
Y 390
12.7%
e 390
12.7%
s 390
12.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 11493
100.0%
ValueCountFrequency (%)
(unknown) 3068
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
Y 2763
24.0%
e 2763
24.0%
s 2763
24.0%
N 1602
13.9%
o 1602
13.9%
ValueCountFrequency (%)
N 949
30.9%
o 949
30.9%
Y 390
12.7%
e 390
12.7%
s 390
12.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 11493
100.0%
ValueCountFrequency (%)
(unknown) 3068
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
Y 2763
24.0%
e 2763
24.0%
s 2763
24.0%
N 1602
13.9%
o 1602
13.9%
ValueCountFrequency (%)
N 949
30.9%
o 949
30.9%
Y 390
12.7%
e 390
12.7%
s 390
12.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 11493
100.0%
ValueCountFrequency (%)
(unknown) 3068
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
Y 2763
24.0%
e 2763
24.0%
s 2763
24.0%
N 1602
13.9%
o 1602
13.9%
ValueCountFrequency (%)
N 949
30.9%
o 949
30.9%
Y 390
12.7%
e 390
12.7%
s 390
12.7%

PhysActiveDays
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          7              7
Distinct (%)                      0.3%           1.4%
Missing                           1951           821
Missing (%)                       44.7%          61.3%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              3.693040597    3.666023166
Minimum                           1              1
Maximum                           7              7
Zeros                             0              0
Zeros (%)                         0.0%           0.0%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           1              1
5-th percentile                   1              1
Q1                                2              2
median                            3              3
Q3                                5              5
95-th percentile                  7              7
Maximum                           7              7
Range                             6              6
Interquartile range (IQR)         3              3

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                1.811180349    1.916649038
Coefficient of variation (CV)     0.4904306632   0.5228142189
Kurtosis                          -0.8321744695  -0.9614052895
Mean                              3.693040597    3.666023166
Median Absolute Deviation (MAD)   1              1
Skewness                          0.3468209776   0.4013099512
Sum                               8915           1899
Variance                          3.280374257    3.673543535
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
3 617
 
14.1%
5 396
 
9.1%
2 389
 
8.9%
4 316
 
7.2%
1 282
 
6.5%
7 276
 
6.3%
6 138
 
3.2%
(Missing) 1951
44.7%
ValueCountFrequency (%)
2 112
 
8.4%
3 100
 
7.5%
5 75
 
5.6%
7 74
 
5.5%
4 70
 
5.2%
1 64
 
4.8%
6 23
 
1.7%
(Missing) 821
61.3%
ValueCountFrequency (%)
1 282
6.5%
2 389
8.9%
3 617
14.1%
4 316
7.2%
5 396
9.1%
ValueCountFrequency (%)
1 64
4.8%
2 112
8.4%
3 100
7.5%
4 70
5.2%
5 75
5.6%
ValueCountFrequency (%)
1 64
1.5%
2 112
2.6%
3 100
2.3%
4 70
1.6%
5 75
1.7%
ValueCountFrequency (%)
1 282
21.1%
2 389
29.1%
3 617
46.1%
4 316
23.6%
5 396
29.6%
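
The summary figures for PhysActiveDays can be cross-checked against the Dataset A value counts listed in this section. The sketch below recomputes the sum, mean, variance, standard deviation, coefficient of variation and the two percentages from those counts. It assumes the sample variance (n - 1 denominator) and that "Distinct (%)" is taken relative to non-missing rows; both are inferences from the numbers, not documented behaviour of the report.

```python
import math

# Value counts for PhysActiveDays, Dataset A, as listed in this section.
counts = {3: 617, 5: 396, 2: 389, 4: 316, 1: 282, 7: 276, 6: 138}
missing = 1951

n = sum(counts.values())                 # non-missing observations
total_rows = n + missing                 # all observations in Dataset A
value_sum = sum(v * c for v, c in counts.items())
mean = value_sum / n

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation

missing_pct = 100 * missing / total_rows
distinct_pct = 100 * len(counts) / n     # assumption: relative to non-missing rows

print(n, total_rows, value_sum)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
print(round(missing_pct, 1), round(distinct_pct, 1))
```

The same arithmetic works for any variable whose value counts are listed in full, which excludes variables with an "Other values" row.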

TVHrsDay
Text

                      Dataset A     Dataset B
Distinct              7             7
Distinct (%)          0.3%          1.1%
Missing               2078          723
Missing (%)           47.6%         54.0%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            9             9
Median length         4             4
Mean length           5.314822912   5.563311688
Min length            4             4

Characters and Unicode

                      Dataset A     Dataset B
Total characters      12155         3427
Distinct characters   13            13
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               3_hr          4_hr
2nd row               More_4_hr     More_4_hr
3rd row               2_hr          2_hr
4th row               3_hr          0_to_1_hr
5th row               2_hr          More_4_hr

Dataset A
Value       Count   Frequency (%)
2_hr        628     27.5%
1_hr        414     18.1%
3_hr        377     16.5%
0_to_1_hr   349     15.3%
more_4_hr   243     10.6%
4_hr        229     10.0%
0_hrs       47      2.1%

Dataset B
Value       Count   Frequency (%)
more_4_hr   143     23.2%
3_hr        120     19.5%
2_hr        116     18.8%
1_hr        93      15.1%
4_hr        80      13.0%
0_to_1_hr   46      7.5%
0_hrs       18      2.9%

Most occurring characters

ValueCountFrequency (%)
_ 3228
26.6%
r 2530
20.8%
h 2287
18.8%
1 763
 
6.3%
2 628
 
5.2%
o 592
 
4.9%
4 472
 
3.9%
0 396
 
3.3%
3 377
 
3.1%
t 349
 
2.9%
Other values (3) 533
 
4.4%
ValueCountFrequency (%)
_ 851
24.8%
r 759
22.1%
h 616
18.0%
4 223
 
6.5%
o 189
 
5.5%
M 143
 
4.2%
e 143
 
4.2%
1 139
 
4.1%
3 120
 
3.5%
2 116
 
3.4%
Other values (3) 128
 
3.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 12155
100.0%
ValueCountFrequency (%)
(unknown) 3427
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
_ 3228
26.6%
r 2530
20.8%
h 2287
18.8%
1 763
 
6.3%
2 628
 
5.2%
o 592
 
4.9%
4 472
 
3.9%
0 396
 
3.3%
3 377
 
3.1%
t 349
 
2.9%
Other values (3) 533
 
4.4%
ValueCountFrequency (%)
_ 851
24.8%
r 759
22.1%
h 616
18.0%
4 223
 
6.5%
o 189
 
5.5%
M 143
 
4.2%
e 143
 
4.2%
1 139
 
4.1%
3 120
 
3.5%
2 116
 
3.4%
Other values (3) 128
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 12155
100.0%
ValueCountFrequency (%)
(unknown) 3427
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
_ 3228
26.6%
r 2530
20.8%
h 2287
18.8%
1 763
 
6.3%
2 628
 
5.2%
o 592
 
4.9%
4 472
 
3.9%
0 396
 
3.3%
3 377
 
3.1%
t 349
 
2.9%
Other values (3) 533
 
4.4%
ValueCountFrequency (%)
_ 851
24.8%
r 759
22.1%
h 616
18.0%
4 223
 
6.5%
o 189
 
5.5%
M 143
 
4.2%
e 143
 
4.2%
1 139
 
4.1%
3 120
 
3.5%
2 116
 
3.4%
Other values (3) 128
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 12155
100.0%
ValueCountFrequency (%)
(unknown) 3427
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
_ 3228
26.6%
r 2530
20.8%
h 2287
18.8%
1 763
 
6.3%
2 628
 
5.2%
o 592
 
4.9%
4 472
 
3.9%
0 396
 
3.3%
3 377
 
3.1%
t 349
 
2.9%
Other values (3) 533
 
4.4%
ValueCountFrequency (%)
_ 851
24.8%
r 759
22.1%
h 616
18.0%
4 223
 
6.5%
o 189
 
5.5%
M 143
 
4.2%
e 143
 
4.2%
1 139
 
4.1%
3 120
 
3.5%
2 116
 
3.4%
Other values (3) 128
 
3.7%

CompHrsDay
Text

                      Dataset A     Dataset B
Distinct              7             7
Distinct (%)          0.3%          1.1%
Missing               2077          722
Missing (%)           47.6%         53.9%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            9             9
Median length         4             5
Mean length           5.914335664   5.883306321
Min length            4             4

Characters and Unicode

                      Dataset A     Dataset B
Total characters      13532         3630
Distinct characters   13            13
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               1_hr          2_hr
2nd row               More_4_hr     0_hrs
3rd row               0_to_1_hr     0_hrs
4th row               1_hr          0_hrs
5th row               0_to_1_hr     2_hr

Dataset A
Value       Count   Frequency (%)
0_to_1_hr   664     29.0%
1_hr        579     25.3%
2_hr        327     14.3%
0_hrs       270     11.8%
3_hr        169     7.4%
more_4_hr   158     6.9%
4_hr        121     5.3%

Dataset B
Value       Count   Frequency (%)
0_hrs       317     51.4%
0_to_1_hr   148     24.0%
1_hr        49      7.9%
2_hr        45      7.3%
3_hr        31      5.0%
more_4_hr   21      3.4%
4_hr        6       1.0%

Most occurring characters

ValueCountFrequency (%)
_ 3774
27.9%
r 2446
18.1%
h 2288
16.9%
1 1243
 
9.2%
0 934
 
6.9%
o 822
 
6.1%
t 664
 
4.9%
2 327
 
2.4%
4 279
 
2.1%
s 270
 
2.0%
Other values (3) 485
 
3.6%
ValueCountFrequency (%)
_ 934
25.7%
r 638
17.6%
h 617
17.0%
0 465
12.8%
s 317
 
8.7%
1 197
 
5.4%
o 169
 
4.7%
t 148
 
4.1%
2 45
 
1.2%
3 31
 
0.9%
Other values (3) 69
 
1.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 13532
100.0%
ValueCountFrequency (%)
(unknown) 3630
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
_ 3774
27.9%
r 2446
18.1%
h 2288
16.9%
1 1243
 
9.2%
0 934
 
6.9%
o 822
 
6.1%
t 664
 
4.9%
2 327
 
2.4%
4 279
 
2.1%
s 270
 
2.0%
Other values (3) 485
 
3.6%
ValueCountFrequency (%)
_ 934
25.7%
r 638
17.6%
h 617
17.0%
0 465
12.8%
s 317
 
8.7%
1 197
 
5.4%
o 169
 
4.7%
t 148
 
4.1%
2 45
 
1.2%
3 31
 
0.9%
Other values (3) 69
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 13532
100.0%
ValueCountFrequency (%)
(unknown) 3630
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
_ 3774
27.9%
r 2446
18.1%
h 2288
16.9%
1 1243
 
9.2%
0 934
 
6.9%
o 822
 
6.1%
t 664
 
4.9%
2 327
 
2.4%
4 279
 
2.1%
s 270
 
2.0%
Other values (3) 485
 
3.6%
ValueCountFrequency (%)
_ 934
25.7%
r 638
17.6%
h 617
17.0%
0 465
12.8%
s 317
 
8.7%
1 197
 
5.4%
o 169
 
4.7%
t 148
 
4.1%
2 45
 
1.2%
3 31
 
0.9%
Other values (3) 69
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 13532
100.0%
ValueCountFrequency (%)
(unknown) 3630
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
_ 3774
27.9%
r 2446
18.1%
h 2288
16.9%
1 1243
 
9.2%
0 934
 
6.9%
o 822
 
6.1%
t 664
 
4.9%
2 327
 
2.4%
4 279
 
2.1%
s 270
 
2.0%
Other values (3) 485
 
3.6%
ValueCountFrequency (%)
_ 934
25.7%
r 638
17.6%
h 617
17.0%
0 465
12.8%
s 317
 
8.7%
1 197
 
5.4%
o 169
 
4.7%
t 148
 
4.1%
2 45
 
1.2%
3 31
 
0.9%
Other values (3) 69
 
1.9%

TVHrsDayChild
Unsupported

                      Dataset A     Dataset B
Missing               4365          1339
Missing (%)           100.0%        100.0%
Memory size           197.2 KiB     53.2 KiB

CompHrsDayChild
Unsupported

                      Dataset A     Dataset B
Missing               4365          1339
Missing (%)           100.0%        100.0%
Memory size           197.2 KiB     53.2 KiB

Alcohol12PlusYr
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          0.1%          0.2%
Missing               423           199
Missing (%)           9.7%          14.9%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            3             3
Median length         3             3
Mean length           2.820649417   2.736842105
Min length            2             2

Characters and Unicode

                      Dataset A     Dataset B
Total characters      11119         3120
Distinct characters   5             5
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               Yes           Yes
2nd row               Yes           Yes
3rd row               Yes           Yes
4th row               Yes           No
5th row               No            Yes

Dataset A
Value       Count   Frequency (%)
yes         3235    82.1%
no          707     17.9%

Dataset B
Value       Count   Frequency (%)
yes         840     73.7%
no          300     26.3%

Most occurring characters

ValueCountFrequency (%)
Y 3235
29.1%
e 3235
29.1%
s 3235
29.1%
N 707
 
6.4%
o 707
 
6.4%
ValueCountFrequency (%)
Y 840
26.9%
e 840
26.9%
s 840
26.9%
N 300
 
9.6%
o 300
 
9.6%

Most occurring categories

ValueCountFrequency (%)
(unknown) 11119
100.0%
ValueCountFrequency (%)
(unknown) 3120
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
Y 3235
29.1%
e 3235
29.1%
s 3235
29.1%
N 707
 
6.4%
o 707
 
6.4%
ValueCountFrequency (%)
Y 840
26.9%
e 840
26.9%
s 840
26.9%
N 300
 
9.6%
o 300
 
9.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 11119
100.0%
ValueCountFrequency (%)
(unknown) 3120
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
Y 3235
29.1%
e 3235
29.1%
s 3235
29.1%
N 707
 
6.4%
o 707
 
6.4%
ValueCountFrequency (%)
Y 840
26.9%
e 840
26.9%
s 840
26.9%
N 300
 
9.6%
o 300
 
9.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 11119
100.0%
ValueCountFrequency (%)
(unknown) 3120
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
Y 3235
29.1%
e 3235
29.1%
s 3235
29.1%
N 707
 
6.4%
o 707
 
6.4%
ValueCountFrequency (%)
Y 840
26.9%
e 840
26.9%
s 840
26.9%
N 300
 
9.6%
o 300
 
9.6%

AlcoholDay
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          17             18
Distinct (%)                      0.5%           2.5%
Missing                           1208           626
Missing (%)                       27.7%          46.8%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              2.442191954    4.100981767
Minimum                           1              1
Maximum                           30             82
Zeros                             0              0
Zeros (%)                         0.0%           0.0%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           1              1
5-th percentile                   1              1
Q1                                1              2
median                            2              3
Q3                                3              6
95-th percentile                  6              12
Maximum                           30             82
Range                             29             81
Interquartile range (IQR)         2              4

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                2.060914869    4.527805684
Coefficient of variation (CV)     0.8438791495   1.104078472
Kurtosis                          28.3006606     123.7077335
Mean                              2.442191954    4.100981767
Median Absolute Deviation (MAD)   1              2
Skewness                          3.753514054    7.999501334
Sum                               7710           2924
Variance                          4.247370099    20.50102432
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
1 1202
27.5%
2 958
21.9%
3 409
 
9.4%
4 231
 
5.3%
6 133
 
3.0%
5 121
 
2.8%
7 28
 
0.6%
8 23
 
0.5%
12 21
 
0.5%
10 14
 
0.3%
Other values (7) 17
 
0.4%
(Missing) 1208
27.7%
ValueCountFrequency (%)
1 163
 
12.2%
2 162
 
12.1%
3 103
 
7.7%
6 77
 
5.8%
4 52
 
3.9%
5 50
 
3.7%
12 31
 
2.3%
10 24
 
1.8%
8 18
 
1.3%
7 16
 
1.2%
Other values (8) 17
 
1.3%
(Missing) 626
46.8%
ValueCountFrequency (%)
1 1202
27.5%
2 958
21.9%
3 409
 
9.4%
4 231
 
5.3%
5 121
 
2.8%
ValueCountFrequency (%)
1 163
12.2%
2 162
12.1%
3 103
7.7%
4 52
 
3.9%
5 50
 
3.7%
ValueCountFrequency (%)
1 163
3.7%
2 162
3.7%
3 103
2.4%
4 52
 
1.2%
5 50
 
1.1%
ValueCountFrequency (%)
1 1202
89.8%
2 958
71.5%
3 409
 
30.5%
4 231
 
17.3%
5 121
 
9.0%

AlcoholYear
Real number (ℝ)

                                  Dataset A      Dataset B
Distinct                          58             37
Distinct (%)                      1.6%           3.8%
Missing                           733            355
Missing (%)                       16.8%          26.5%
Infinite                          0              0
Infinite (%)                      0.0%           0.0%
Mean                              82.7535793     59.05894309
Minimum                           0              0
Maximum                           364            364
Zeros                             473            270
Zeros (%)                         10.8%          20.2%
Negative                          0              0
Negative (%)                      0.0%           0.0%
Memory size                       197.2 KiB      53.2 KiB

Quantile statistics

                                  Dataset A      Dataset B
Minimum                           0              0
5-th percentile                   0              0
Q1                                5              0
median                            36             12
Q3                                104            52
95-th percentile                  364            364
Maximum                           364            364
Range                             364            364
Interquartile range (IQR)         99             52

Descriptive statistics

                                  Dataset A      Dataset B
Standard deviation                104.0034643    101.0941293
Coefficient of variation (CV)     1.256785086    1.711749719
Kurtosis                          0.9261815492   3.247306627
Mean                              82.7535793     59.05894309
Median Absolute Deviation (MAD)   35             12
Skewness                          1.402414672    2.077355401
Sum                               300561         58114
Variance                          10816.72059    10220.02297
Monotonicity                      Not monotonic  Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 473
10.8%
52 403
9.2%
104 355
 
8.1%
12 321
 
7.4%
24 299
 
6.8%
156 240
 
5.5%
364 189
 
4.3%
208 178
 
4.1%
260 163
 
3.7%
1 113
 
2.6%
Other values (48) 898
20.6%
(Missing) 733
16.8%
ValueCountFrequency (%)
0 270
20.2%
52 91
 
6.8%
12 78
 
5.8%
104 67
 
5.0%
24 67
 
5.0%
364 63
 
4.7%
1 59
 
4.4%
2 41
 
3.1%
156 35
 
2.6%
3 34
 
2.5%
Other values (27) 179
13.4%
(Missing) 355
26.5%
ValueCountFrequency (%)
0 473
10.8%
1 113
 
2.6%
2 104
 
2.4%
3 102
 
2.3%
4 61
 
1.4%
ValueCountFrequency (%)
0 270
20.2%
1 59
 
4.4%
2 41
 
3.1%
3 34
 
2.5%
4 20
 
1.5%
ValueCountFrequency (%)
0 270
6.2%
1 59
 
1.4%
2 41
 
0.9%
3 34
 
0.8%
4 20
 
0.5%
ValueCountFrequency (%)
0 473
35.3%
1 113
 
8.4%
2 104
 
7.8%
3 102
 
7.6%
4 61
 
4.6%

SmokeNow
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          0.1%          0.3%
Missing               2653          596
Missing (%)           60.8%         44.5%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            3             3
Median length         2             3
Mean length           2.382009346   2.569313594
Min length            2             2

Characters and Unicode

                      Dataset A     Dataset B
Total characters      4078          1909
Distinct characters   5             5
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               No            Yes
2nd row               Yes           No
3rd row               Yes           Yes
4th row               No            No
5th row               No            Yes

Dataset A
Value       Count   Frequency (%)
no          1058    61.8%
yes         654     38.2%

Dataset B
Value       Count   Frequency (%)
yes         423     56.9%
no          320     43.1%

Most occurring characters

ValueCountFrequency (%)
N 1058
25.9%
o 1058
25.9%
Y 654
16.0%
e 654
16.0%
s 654
16.0%
ValueCountFrequency (%)
Y 423
22.2%
e 423
22.2%
s 423
22.2%
N 320
16.8%
o 320
16.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 4078
100.0%
ValueCountFrequency (%)
(unknown) 1909
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 1058
25.9%
o 1058
25.9%
Y 654
16.0%
e 654
16.0%
s 654
16.0%
ValueCountFrequency (%)
Y 423
22.2%
e 423
22.2%
s 423
22.2%
N 320
16.8%
o 320
16.8%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 4078
100.0%
ValueCountFrequency (%)
(unknown) 1909
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 1058
25.9%
o 1058
25.9%
Y 654
16.0%
e 654
16.0%
s 654
16.0%
ValueCountFrequency (%)
Y 423
22.2%
e 423
22.2%
s 423
22.2%
N 320
16.8%
o 320
16.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 4078
100.0%
ValueCountFrequency (%)
(unknown) 1909
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 1058
25.9%
o 1058
25.9%
Y 654
16.0%
e 654
16.0%
s 654
16.0%
ValueCountFrequency (%)
Y 423
22.2%
e 423
22.2%
s 423
22.2%
N 320
16.8%
o 320
16.8%

Smoke100
Text

                      Dataset A     Dataset B
Distinct              2             2
Distinct (%)          < 0.1%        0.1%
Missing               0             0
Missing (%)           0.0%          0.0%
Memory size           197.2 KiB     53.2 KiB

Length

                      Dataset A     Dataset B
Max length            3             3
Median length         2             3
Mean length           2.392210767   2.55489171
Min length            2             2

Characters and Unicode

                      Dataset A     Dataset B
Total characters      10442         3421
Distinct characters   5             5
Distinct categories   1             1
Distinct scripts      1             1
Distinct blocks       1             1

Unique

                      Dataset A     Dataset B
Unique                0             0
Unique (%)            0.0%          0.0%

Sample

                      Dataset A     Dataset B
1st row               No            No
2nd row               No            Yes
3rd row               Yes           No
4th row               No            No
5th row               No            Yes

Dataset A
Value       Count   Frequency (%)
no          2653    60.8%
yes         1712    39.2%

Dataset B
Value       Count   Frequency (%)
yes         743     55.5%
no          596     44.5%

Most occurring characters

ValueCountFrequency (%)
N 2653
25.4%
o 2653
25.4%
Y 1712
16.4%
e 1712
16.4%
s 1712
16.4%
ValueCountFrequency (%)
Y 743
21.7%
e 743
21.7%
s 743
21.7%
N 596
17.4%
o 596
17.4%

Most occurring categories

ValueCountFrequency (%)
(unknown) 10442
100.0%
ValueCountFrequency (%)
(unknown) 3421
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 2653
25.4%
o 2653
25.4%
Y 1712
16.4%
e 1712
16.4%
s 1712
16.4%
ValueCountFrequency (%)
Y 743
21.7%
e 743
21.7%
s 743
21.7%
N 596
17.4%
o 596
17.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 10442
100.0%
ValueCountFrequency (%)
(unknown) 3421
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 2653
25.4%
o 2653
25.4%
Y 1712
16.4%
e 1712
16.4%
s 1712
16.4%
ValueCountFrequency (%)
Y 743
21.7%
e 743
21.7%
s 743
21.7%
N 596
17.4%
o 596
17.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 10442
100.0%
ValueCountFrequency (%)
(unknown) 3421
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 2653
25.4%
o 2653
25.4%
Y 1712
16.4%
e 1712
16.4%
s 1712
16.4%
ValueCountFrequency (%)
Y 743
21.7%
e 743
21.7%
s 743
21.7%
N 596
17.4%
o 596
17.4%

Smoke100n
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  < 0.1%     0.1%
Missing       0          0
Missing (%)   0.0%       0.0%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A   Dataset B
Max length     10          10
Median length  10          6
Mean length    8.43115693  7.780433159
Min length     6           6

Characters and Unicode

                     Dataset A  Dataset B
Total characters     36802      10418
Distinct characters  9          9
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A   Dataset B
1st row  Non-Smoker  Non-Smoker
2nd row  Non-Smoker  Smoker
3rd row  Smoker      Non-Smoker
4th row  Non-Smoker  Non-Smoker
5th row  Non-Smoker  Smoker

Dataset A
Value       Count  Frequency (%)
non-smoker  2653   60.8%
smoker      1712   39.2%

Dataset B
Value       Count  Frequency (%)
smoker      743    55.5%
non-smoker  596    44.5%

Most occurring characters

ValueCountFrequency (%)
o 7018
19.1%
S 4365
11.9%
m 4365
11.9%
k 4365
11.9%
e 4365
11.9%
r 4365
11.9%
N 2653
 
7.2%
n 2653
 
7.2%
- 2653
 
7.2%
ValueCountFrequency (%)
o 1935
18.6%
S 1339
12.9%
m 1339
12.9%
k 1339
12.9%
e 1339
12.9%
r 1339
12.9%
N 596
 
5.7%
n 596
 
5.7%
- 596
 
5.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 36802
100.0%
ValueCountFrequency (%)
(unknown) 10418
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
o 7018
19.1%
S 4365
11.9%
m 4365
11.9%
k 4365
11.9%
e 4365
11.9%
r 4365
11.9%
N 2653
 
7.2%
n 2653
 
7.2%
- 2653
 
7.2%
ValueCountFrequency (%)
o 1935
18.6%
S 1339
12.9%
m 1339
12.9%
k 1339
12.9%
e 1339
12.9%
r 1339
12.9%
N 596
 
5.7%
n 596
 
5.7%
- 596
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 36802
100.0%
ValueCountFrequency (%)
(unknown) 10418
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
o 7018
19.1%
S 4365
11.9%
m 4365
11.9%
k 4365
11.9%
e 4365
11.9%
r 4365
11.9%
N 2653
 
7.2%
n 2653
 
7.2%
- 2653
 
7.2%
ValueCountFrequency (%)
o 1935
18.6%
S 1339
12.9%
m 1339
12.9%
k 1339
12.9%
e 1339
12.9%
r 1339
12.9%
N 596
 
5.7%
n 596
 
5.7%
- 596
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 36802
100.0%
ValueCountFrequency (%)
(unknown) 10418
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
o 7018
19.1%
S 4365
11.9%
m 4365
11.9%
k 4365
11.9%
e 4365
11.9%
r 4365
11.9%
N 2653
 
7.2%
n 2653
 
7.2%
- 2653
 
7.2%
ValueCountFrequency (%)
o 1935
18.6%
S 1339
12.9%
m 1339
12.9%
k 1339
12.9%
e 1339
12.9%
r 1339
12.9%
N 596
 
5.7%
n 596
 
5.7%
- 596
 
5.7%

SmokeAge
Real number (ℝ)

              Dataset A    Dataset B
Distinct      34           36
Distinct (%)  2.1%         5.0%
Missing       2744         613
Missing (%)   62.9%        45.8%
Infinite      0            0
Infinite (%)  0.0%         0.0%
Mean          18.24861197  17.19283747
Minimum       7            6
Maximum       50           72
Zeros         0            0
Zeros (%)     0.0%         0.0%
Negative      0            0
Negative (%)  0.0%         0.0%
Memory size   197.2 KiB    53.2 KiB
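
The percentage rows for SmokeAge can be checked against the counts. The sketch below assumes, based on the figures themselves, that Missing (%) is taken over all observations (4365 and 1339) while Distinct (%) is taken over the non-missing values only; the helper name `share` is hypothetical.

```python
def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)

# Figures reported for SmokeAge: (observations, missing, distinct) per dataset.
reported = {"Dataset A": (4365, 2744, 34), "Dataset B": (1339, 613, 36)}

results = {}
for name, (n_rows, n_missing, n_distinct) in reported.items():
    n_present = n_rows - n_missing  # assumption: distinct share uses non-missing rows
    results[name] = (share(n_missing, n_rows), share(n_distinct, n_present))
    print(name, results[name])
```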

Quantile statistics

                           Dataset A  Dataset B
Minimum                    7          6
5-th percentile            13         10
Q1                         15         14
median                     18         16
Q3                         20         18
95-th percentile           27         28.5
Maximum                    50         72
Range                      43         66
Interquartile range (IQR)  5          4
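
Range and interquartile range are derived from the other rows: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal sketch with the SmokeAge values; the helper name `spread` and the five-element sample are illustrative assumptions, and the interpolation method used by the profiling tool may differ from the one shown here.

```python
from statistics import quantiles

def spread(minimum, q1, q3, maximum):
    """Return (range, interquartile range) from four summary values."""
    return maximum - minimum, q3 - q1

# Summary values reported for SmokeAge.
print(spread(7, 15, 20, 50))   # Dataset A
print(spread(6, 14, 18, 72))   # Dataset B

# Quartiles of a small, made-up sample (not taken from the report).
sample = [1, 2, 3, 4, 5]
q1, q2, q3 = quantiles(sample, n=4, method="inclusive")
print(q1, q2, q3, q3 - q1)
```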

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               4.70295477     6.548177188
Coefficient of variation (CV)    0.2577157527   0.380866579
Kurtosis                         7.02812305     15.72308138
Mean                             18.24861197    17.19283747
Median Absolute Deviation (MAD)  2              2
Skewness                         1.973801112    3.255806362
Sum                              29581          12482
Variance                         22.11778357    42.87862449
Monotonicity                     Not monotonic  Not monotonic
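
Several of these rows are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of non-missing values. The sketch below checks them with the SmokeAge figures; the non-missing counts are an assumption derived as observations minus missing (4365 - 2744 and 1339 - 613).

```python
import math

# Figures reported for SmokeAge: std, mean, variance, sum, non-missing count.
reported = {
    "Dataset A": dict(std=4.70295477, mean=18.24861197, variance=22.11778357,
                      total=29581, count=4365 - 2744),
    "Dataset B": dict(std=6.548177188, mean=17.19283747, variance=42.87862449,
                      total=12482, count=1339 - 613),
}

for name, r in reported.items():
    cv = r["std"] / r["mean"]        # coefficient of variation
    variance = r["std"] ** 2         # variance is the squared standard deviation
    mean = r["total"] / r["count"]   # mean over non-missing observations
    print(name, round(cv, 6), round(variance, 4), round(mean, 6))
```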
Histogram with fixed size bins (bins=34)
ValueCountFrequency (%)
18 250
 
5.7%
16 189
 
4.3%
15 188
 
4.3%
19 146
 
3.3%
17 144
 
3.3%
20 128
 
2.9%
14 112
 
2.6%
21 70
 
1.6%
22 69
 
1.6%
13 66
 
1.5%
Other values (24) 259
 
5.9%
(Missing) 2744
62.9%
ValueCountFrequency (%)
16 107
 
8.0%
18 100
 
7.5%
15 99
 
7.4%
17 67
 
5.0%
14 65
 
4.9%
13 52
 
3.9%
12 40
 
3.0%
20 29
 
2.2%
19 22
 
1.6%
25 19
 
1.4%
Other values (26) 126
 
9.4%
(Missing) 613
45.8%
ValueCountFrequency (%)
7 3
 
0.1%
9 1
 
< 0.1%
10 13
 
0.3%
11 10
 
0.2%
12 45
1.0%
ValueCountFrequency (%)
6 4
 
0.3%
7 1
 
0.1%
8 3
 
0.2%
9 13
1.0%
10 19
1.4%
ValueCountFrequency (%)
6 4
 
0.1%
7 1
 
< 0.1%
8 3
 
0.1%
9 13
0.3%
10 19
0.4%
ValueCountFrequency (%)
7 3
 
0.2%
9 1
 
0.1%
10 13
 
1.0%
11 10
 
0.7%
12 45
3.4%

Marijuana
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  0.1%       0.3%
Missing       1374       602
Missing (%)   31.5%      45.0%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     3            3
Median length  3            3
Mean length    2.601136744  2.519674355
Min length     2            2

Characters and Unicode

                     Dataset A  Dataset B
Total characters     7780       1857
Distinct characters  5          5
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A  Dataset B
1st row  Yes        Yes
2nd row  Yes        No
3rd row  No         No
4th row  No         Yes
5th row  Yes        Yes

Dataset A
Value  Count  Frequency (%)
yes    1798   60.1%
no     1193   39.9%

Dataset B
Value  Count  Frequency (%)
yes    383    52.0%
no     354    48.0%

Most occurring characters

ValueCountFrequency (%)
Y 1798
23.1%
e 1798
23.1%
s 1798
23.1%
N 1193
15.3%
o 1193
15.3%
ValueCountFrequency (%)
Y 383
20.6%
e 383
20.6%
s 383
20.6%
N 354
19.1%
o 354
19.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 7780
100.0%
ValueCountFrequency (%)
(unknown) 1857
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
Y 1798
23.1%
e 1798
23.1%
s 1798
23.1%
N 1193
15.3%
o 1193
15.3%
ValueCountFrequency (%)
Y 383
20.6%
e 383
20.6%
s 383
20.6%
N 354
19.1%
o 354
19.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 7780
100.0%
ValueCountFrequency (%)
(unknown) 1857
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
Y 1798
23.1%
e 1798
23.1%
s 1798
23.1%
N 1193
15.3%
o 1193
15.3%
ValueCountFrequency (%)
Y 383
20.6%
e 383
20.6%
s 383
20.6%
N 354
19.1%
o 354
19.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 7780
100.0%
ValueCountFrequency (%)
(unknown) 1857
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
Y 1798
23.1%
e 1798
23.1%
s 1798
23.1%
N 1193
15.3%
o 1193
15.3%
ValueCountFrequency (%)
Y 383
20.6%
e 383
20.6%
s 383
20.6%
N 354
19.1%
o 354
19.1%

AgeFirstMarij
Real number (ℝ)

              Dataset A    Dataset B
Distinct      31           25
Distinct (%)  1.7%         6.5%
Missing       2568         956
Missing (%)   58.8%        71.4%
Infinite      0            0
Infinite (%)  0.0%         0.0%
Mean          17.42404007  16.32375979
Minimum       1            7
Maximum       48           46
Zeros         0            0
Zeros (%)     0.0%         0.0%
Negative      0            0
Negative (%)  0.0%         0.0%
Memory size   197.2 KiB    53.2 KiB

Quantile statistics

                           Dataset A  Dataset B
Minimum                    1          7
5-th percentile            13         11
Q1                         15         14
median                     17         16
Q3                         19         18
95-th percentile           24         25
Maximum                    48         46
Range                      47         39
Interquartile range (IQR)  4          4

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               3.679601377    4.261680526
Coefficient of variation (CV)    0.2111795751   0.2610722395
Kurtosis                         10.92391899    6.513142938
Mean                             17.42404007    16.32375979
Median Absolute Deviation (MAD)  2              2
Skewness                         2.046133828    1.603072708
Sum                              31311          6252
Variance                         13.5394663     18.16192091
Monotonicity                     Not monotonic  Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
16 293
 
6.7%
18 282
 
6.5%
15 202
 
4.6%
17 182
 
4.2%
19 176
 
4.0%
14 140
 
3.2%
20 121
 
2.8%
13 81
 
1.9%
21 76
 
1.7%
12 64
 
1.5%
Other values (21) 180
 
4.1%
(Missing) 2568
58.8%
ValueCountFrequency (%)
15 56
 
4.2%
16 55
 
4.1%
18 39
 
2.9%
17 34
 
2.5%
13 33
 
2.5%
12 32
 
2.4%
14 31
 
2.3%
19 18
 
1.3%
20 15
 
1.1%
11 10
 
0.7%
Other values (15) 60
 
4.5%
(Missing) 956
71.4%
ValueCountFrequency (%)
1 1
 
< 0.1%
9 2
 
< 0.1%
10 6
 
0.1%
11 10
 
0.2%
12 64
1.5%
ValueCountFrequency (%)
7 2
 
0.1%
8 2
 
0.1%
9 3
 
0.2%
10 9
0.7%
11 10
0.7%
ValueCountFrequency (%)
7 2
 
< 0.1%
8 2
 
< 0.1%
9 3
 
0.1%
10 9
0.2%
11 10
0.2%
ValueCountFrequency (%)
1 1
 
0.1%
9 2
 
0.1%
10 6
 
0.4%
11 10
 
0.7%
12 64
4.8%

RegularMarij
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  0.1%       0.3%
Missing       1374       602
Missing (%)   31.5%      45.0%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     3            3
Median length  2            2
Mean length    2.242059512  2.335142469
Min length     2            2

Characters and Unicode

                     Dataset A  Dataset B
Total characters     6706       1721
Distinct characters  5          5
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A  Dataset B
1st row  No         Yes
2nd row  No         No
3rd row  No         No
4th row  No         Yes
5th row  No         No

Dataset A
Value  Count  Frequency (%)
no     2267   75.8%
yes    724    24.2%

Dataset B
Value  Count  Frequency (%)
no     490    66.5%
yes    247    33.5%

Most occurring characters

ValueCountFrequency (%)
N 2267
33.8%
o 2267
33.8%
Y 724
 
10.8%
e 724
 
10.8%
s 724
 
10.8%
ValueCountFrequency (%)
N 490
28.5%
o 490
28.5%
Y 247
14.4%
e 247
14.4%
s 247
14.4%

Most occurring categories

ValueCountFrequency (%)
(unknown) 6706
100.0%
ValueCountFrequency (%)
(unknown) 1721
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 2267
33.8%
o 2267
33.8%
Y 724
 
10.8%
e 724
 
10.8%
s 724
 
10.8%
ValueCountFrequency (%)
N 490
28.5%
o 490
28.5%
Y 247
14.4%
e 247
14.4%
s 247
14.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 6706
100.0%
ValueCountFrequency (%)
(unknown) 1721
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 2267
33.8%
o 2267
33.8%
Y 724
 
10.8%
e 724
 
10.8%
s 724
 
10.8%
ValueCountFrequency (%)
N 490
28.5%
o 490
28.5%
Y 247
14.4%
e 247
14.4%
s 247
14.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 6706
100.0%
ValueCountFrequency (%)
(unknown) 1721
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 2267
33.8%
o 2267
33.8%
Y 724
 
10.8%
e 724
 
10.8%
s 724
 
10.8%
ValueCountFrequency (%)
N 490
28.5%
o 490
28.5%
Y 247
14.4%
e 247
14.4%
s 247
14.4%

AgeRegMarij
Real number (ℝ)

              Dataset A    Dataset B
Distinct      27           23
Distinct (%)  3.7%         9.3%
Missing       3641         1092
Missing (%)   83.4%        81.6%
Infinite      0            0
Infinite (%)  0.0%         0.0%
Mean          18.09392265  16.88259109
Minimum       11           8
Maximum       52           50
Zeros         0            0
Zeros (%)     0.0%         0.0%
Negative      0            0
Negative (%)  0.0%         0.0%
Memory size   197.2 KiB    53.2 KiB

Quantile statistics

                           Dataset A  Dataset B
Minimum                    11         8
5-th percentile            13         11.3
Q1                         16         14
median                     17         16
Q3                         20         19
95-th percentile           25         25
Maximum                    52         50
Range                      41         42
Interquartile range (IQR)  4          5

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               4.585987024    4.774421662
Coefficient of variation (CV)    0.25345455     0.2828014749
Kurtosis                         12.24465662    13.83557993
Mean                             18.09392265    16.88259109
Median Absolute Deviation (MAD)  2              2
Skewness                         2.697248206    2.496684371
Sum                              13100          4170
Variance                         21.03127698    22.7951022
Monotonicity                     Not monotonic  Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
16 123
 
2.8%
18 113
 
2.6%
15 81
 
1.9%
20 69
 
1.6%
17 69
 
1.6%
19 50
 
1.1%
21 36
 
0.8%
14 35
 
0.8%
13 34
 
0.8%
22 25
 
0.6%
Other values (17) 89
 
2.0%
(Missing) 3641
83.4%
ValueCountFrequency (%)
16 35
 
2.6%
18 27
 
2.0%
14 24
 
1.8%
17 23
 
1.7%
15 23
 
1.7%
12 20
 
1.5%
19 18
 
1.3%
20 17
 
1.3%
13 16
 
1.2%
25 10
 
0.7%
Other values (13) 34
 
2.5%
(Missing) 1092
81.6%
ValueCountFrequency (%)
11 3
 
0.1%
12 23
 
0.5%
13 34
0.8%
14 35
0.8%
15 81
1.9%
ValueCountFrequency (%)
8 3
 
0.2%
9 1
 
0.1%
10 7
 
0.5%
11 2
 
0.1%
12 20
1.5%
ValueCountFrequency (%)
8 3
 
0.1%
9 1
 
< 0.1%
10 7
 
0.2%
11 2
 
< 0.1%
12 20
0.5%
ValueCountFrequency (%)
11 3
 
0.2%
12 23
 
1.7%
13 34
2.5%
14 35
2.6%
15 81
6.0%

HardDrugs
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  0.1%       0.2%
Missing       835        484
Missing (%)   19.1%      36.1%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     3            3
Median length  2            2
Mean length    2.175637394  2.222222222
Min length     2            2

Characters and Unicode

                     Dataset A  Dataset B
Total characters     7680       1900
Distinct characters  5          5
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A  Dataset B
1st row  Yes        No
2nd row  No         No
3rd row  No         No
4th row  No         Yes
5th row  No         No

Dataset A
Value  Count  Frequency (%)
no     2910   82.4%
yes    620    17.6%

Dataset B
Value  Count  Frequency (%)
no     665    77.8%
yes    190    22.2%

Most occurring characters

ValueCountFrequency (%)
N 2910
37.9%
o 2910
37.9%
Y 620
 
8.1%
e 620
 
8.1%
s 620
 
8.1%
ValueCountFrequency (%)
N 665
35.0%
o 665
35.0%
Y 190
 
10.0%
e 190
 
10.0%
s 190
 
10.0%

Most occurring categories

ValueCountFrequency (%)
(unknown) 7680
100.0%
ValueCountFrequency (%)
(unknown) 1900
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 2910
37.9%
o 2910
37.9%
Y 620
 
8.1%
e 620
 
8.1%
s 620
 
8.1%
ValueCountFrequency (%)
N 665
35.0%
o 665
35.0%
Y 190
 
10.0%
e 190
 
10.0%
s 190
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 7680
100.0%
ValueCountFrequency (%)
(unknown) 1900
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 2910
37.9%
o 2910
37.9%
Y 620
 
8.1%
e 620
 
8.1%
s 620
 
8.1%
ValueCountFrequency (%)
N 665
35.0%
o 665
35.0%
Y 190
 
10.0%
e 190
 
10.0%
s 190
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 7680
100.0%
ValueCountFrequency (%)
(unknown) 1900
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 2910
37.9%
o 2910
37.9%
Y 620
 
8.1%
e 620
 
8.1%
s 620
 
8.1%
ValueCountFrequency (%)
N 665
35.0%
o 665
35.0%
Y 190
 
10.0%
e 190
 
10.0%
s 190
 
10.0%

SexEver
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  0.1%       0.2%
Missing       837        480
Missing (%)   19.2%      35.8%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     3            3
Median length  3            3
Mean length    2.966269841  2.95459837
Min length     2            2

Characters and Unicode

                     Dataset A  Dataset B
Total characters     10465      2538
Distinct characters  5          5
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A  Dataset B
1st row  Yes        Yes
2nd row  Yes        Yes
3rd row  Yes        Yes
4th row  Yes        Yes
5th row  Yes        Yes

Dataset A
Value  Count  Frequency (%)
yes    3409   96.6%
no     119    3.4%

Dataset B
Value  Count  Frequency (%)
yes    820    95.5%
no     39     4.5%

Most occurring characters

ValueCountFrequency (%)
Y 3409
32.6%
e 3409
32.6%
s 3409
32.6%
N 119
 
1.1%
o 119
 
1.1%
ValueCountFrequency (%)
Y 820
32.3%
e 820
32.3%
s 820
32.3%
N 39
 
1.5%
o 39
 
1.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 10465
100.0%
ValueCountFrequency (%)
(unknown) 2538
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
Y 3409
32.6%
e 3409
32.6%
s 3409
32.6%
N 119
 
1.1%
o 119
 
1.1%
ValueCountFrequency (%)
Y 820
32.3%
e 820
32.3%
s 820
32.3%
N 39
 
1.5%
o 39
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 10465
100.0%
ValueCountFrequency (%)
(unknown) 2538
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
Y 3409
32.6%
e 3409
32.6%
s 3409
32.6%
N 119
 
1.1%
o 119
 
1.1%
ValueCountFrequency (%)
Y 820
32.3%
e 820
32.3%
s 820
32.3%
N 39
 
1.5%
o 39
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 10465
100.0%
ValueCountFrequency (%)
(unknown) 2538
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
Y 3409
32.6%
e 3409
32.6%
s 3409
32.6%
N 119
 
1.1%
o 119
 
1.1%
ValueCountFrequency (%)
Y 820
32.3%
e 820
32.3%
s 820
32.3%
N 39
 
1.5%
o 39
 
1.5%

SexAge
Real number (ℝ)

              Dataset A    Dataset B
Distinct      34           28
Distinct (%)  1.0%         3.4%
Missing       959          520
Missing (%)   22.0%        38.8%
Infinite      0            0
Infinite (%)  0.0%         0.0%
Mean          17.94862008  16.36752137
Minimum       9            9
Maximum       50           39
Zeros         0            0
Zeros (%)     0.0%         0.0%
Negative      0            0
Negative (%)  0.0%         0.0%
Memory size   197.2 KiB    53.2 KiB

Quantile statistics

                           Dataset A  Dataset B
Minimum                    9          9
5-th percentile            13         12
Q1                         16         14
median                     17         16
Q3                         19         18
95-th percentile           25         23
Maximum                    50         39
Range                      41         30
Interquartile range (IQR)  3          4

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               3.722614484    3.84262162
Coefficient of variation (CV)    0.2074039378   0.2347711381
Kurtosis                         8.456029555    6.716275225
Mean                             17.94862008    16.36752137
Median Absolute Deviation (MAD)  2              2
Skewness                         1.878693348    1.878930775
Sum                              61133          13405
Variance                         13.85785859    14.76574091
Monotonicity                     Not monotonic  Not monotonic
Histogram with fixed size bins (bins=34)
ValueCountFrequency (%)
17 539
12.3%
16 530
12.1%
18 526
12.1%
19 341
 
7.8%
15 324
 
7.4%
20 239
 
5.5%
14 177
 
4.1%
21 131
 
3.0%
22 104
 
2.4%
13 89
 
2.0%
Other values (24) 406
9.3%
(Missing) 959
22.0%
ValueCountFrequency (%)
15 134
 
10.0%
16 113
 
8.4%
18 96
 
7.2%
14 89
 
6.6%
17 89
 
6.6%
13 75
 
5.6%
12 46
 
3.4%
20 38
 
2.8%
19 38
 
2.8%
21 19
 
1.4%
Other values (18) 82
 
6.1%
(Missing) 520
38.8%
ValueCountFrequency (%)
9 26
 
0.6%
10 5
 
0.1%
11 16
 
0.4%
12 49
1.1%
13 89
2.0%
ValueCountFrequency (%)
9 13
 
1.0%
10 9
 
0.7%
11 8
 
0.6%
12 46
3.4%
13 75
5.6%
ValueCountFrequency (%)
9 13
 
0.3%
10 9
 
0.2%
11 8
 
0.2%
12 46
1.1%
13 75
1.7%
ValueCountFrequency (%)
9 26
 
1.9%
10 5
 
0.4%
11 16
 
1.2%
12 49
3.7%
13 89
6.6%

SexNumPartnLife
Real number (ℝ)

              Dataset A    Dataset B
Distinct      66           57
Distinct (%)  1.9%         6.8%
Missing       852          501
Missing (%)   19.5%        37.4%
Infinite      0            0
Infinite (%)  0.0%         0.0%
Mean          12.44577284  22.91885442
Minimum       0            0
Maximum       999          1000
Zeros         139          45
Zeros (%)     3.2%         3.4%
Negative      0            0
Negative (%)  0.0%         0.0%
Memory size   197.2 KiB    53.2 KiB

Quantile statistics

                           Dataset A  Dataset B
Minimum                    0          0
5-th percentile            1          0
Q1                         2          3
median                     5          7
Q3                         11         15
95-th percentile           40         80.75
Maximum                    999        1000
Range                      999        1000
Interquartile range (IQR)  9          12

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               38.50161935    81.33235267
Coefficient of variation (CV)    3.093549901    3.548709338
Kurtosis                         429.1728557    103.5473532
Mean                             12.44577284    22.91885442
Median Absolute Deviation (MAD)  4              5
Skewness                         18.45706433    9.486362244
Sum                              43722          19206
Variance                         1482.374692    6614.951592
Monotonicity                     Not monotonic  Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 515
11.8%
3 352
 
8.1%
5 343
 
7.9%
2 238
 
5.5%
6 236
 
5.4%
10 224
 
5.1%
4 194
 
4.4%
7 147
 
3.4%
0 139
 
3.2%
8 138
 
3.2%
Other values (56) 987
22.6%
(Missing) 852
19.5%
ValueCountFrequency (%)
1 80
 
6.0%
3 76
 
5.7%
5 73
 
5.5%
2 57
 
4.3%
10 52
 
3.9%
15 49
 
3.7%
0 45
 
3.4%
4 44
 
3.3%
20 44
 
3.3%
8 42
 
3.1%
Other values (47) 276
20.6%
(Missing) 501
37.4%
ValueCountFrequency (%)
0 139
 
3.2%
1 515
11.8%
2 238
5.5%
3 352
8.1%
4 194
 
4.4%
ValueCountFrequency (%)
0 45
3.4%
1 80
6.0%
2 57
4.3%
3 76
5.7%
4 44
3.3%
Dataset B
Value  Count  Frequency (%)
0      45     3.4%
1      80     6.0%
2      57     4.3%
3      76     5.7%
4      44     3.3%
Dataset A
Value  Count  Frequency (%)
0      139    3.2%
1      515    11.8%
2      238    5.5%
3      352    8.1%
4      194    4.4%

SexNumPartYear
Real number (ℝ)

              Dataset A  Dataset B
Distinct      17         14
Distinct (%)  0.6%       1.9%
Missing       1382       607
Missing (%)   31.7%      45.3%
Infinite      0          0
Infinite (%)  0.0%       0.0%
Mean          1.1615823  1.841530055
Minimum       0          0
Maximum       20         69
Zeros         518        133
Zeros (%)     11.9%      9.9%
Negative      0          0
Negative (%)  0.0%       0.0%
Memory size   197.2 KiB  53.2 KiB

Quantile statistics

                           Dataset A  Dataset B
Minimum                    0          0
5-th percentile            0          0
Q1                         1          1
median                     1          1
Q3                         1          1
95-th percentile           3          5
Maximum                    20         69
Range                      20         69
Interquartile range (IQR)  0          0

Descriptive statistics

                                 Dataset A      Dataset B
Standard deviation               1.524901436    5.278251779
Coefficient of variation (CV)    1.312779505    2.866231678
Kurtosis                         63.16123643    95.28194479
Mean                             1.1615823      1.841530055
Median Absolute Deviation (MAD)  0              0
Skewness                         6.7809007      9.064146077
Sum                              3465           1348
Variance                         2.32532439     27.85994184
Monotonicity                     Not monotonic  Not monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
1 2084
47.7%
0 518
 
11.9%
2 209
 
4.8%
3 73
 
1.7%
5 31
 
0.7%
4 19
 
0.4%
8 14
 
0.3%
6 7
 
0.2%
10 6
 
0.1%
9 4
 
0.1%
Other values (7) 18
 
0.4%
(Missing) 1382
31.7%
ValueCountFrequency (%)
1 457
34.1%
0 133
 
9.9%
2 52
 
3.9%
3 45
 
3.4%
5 14
 
1.0%
6 7
 
0.5%
30 6
 
0.4%
4 4
 
0.3%
9 4
 
0.3%
50 2
 
0.1%
Other values (4) 8
 
0.6%
(Missing) 607
45.3%
ValueCountFrequency (%)
0 518
 
11.9%
1 2084
47.7%
2 209
 
4.8%
3 73
 
1.7%
4 19
 
0.4%
ValueCountFrequency (%)
0 133
 
9.9%
1 457
34.1%
2 52
 
3.9%
3 45
 
3.4%
4 4
 
0.3%
Dataset B
Value  Count  Frequency (%)
0      133    9.9%
1      457    34.1%
2      52     3.9%
3      45     3.4%
4      4      0.3%
Dataset A
Value  Count  Frequency (%)
0      518    11.9%
1      2084   47.7%
2      209    4.8%
3      73     1.7%
4      19     0.4%

SameSex
['Text', 'Text']

              Dataset A  Dataset B
Distinct      2          2
Distinct (%)  0.1%       0.2%
Missing       835        481
Missing (%)   19.1%      35.9%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     3            3
Median length  2            2
Mean length    2.076770538  2.064102564
Min length     2            2

Characters and Unicode

                     Dataset A  Dataset B
Total characters     7331       1771
Distinct characters  5          5
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A  Dataset B
1st row  No         No
2nd row  No         No
3rd row  No         No
4th row  No         No
5th row  No         No

Dataset A
Value  Count  Frequency (%)
no     3259   92.3%
yes    271    7.7%

Dataset B
Value  Count  Frequency (%)
no     803    93.6%
yes    55     6.4%

Most occurring characters

ValueCountFrequency (%)
N 3259
44.5%
o 3259
44.5%
Y 271
 
3.7%
e 271
 
3.7%
s 271
 
3.7%
ValueCountFrequency (%)
N 803
45.3%
o 803
45.3%
Y 55
 
3.1%
e 55
 
3.1%
s 55
 
3.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 7331
100.0%
ValueCountFrequency (%)
(unknown) 1771
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
N 3259
44.5%
o 3259
44.5%
Y 271
 
3.7%
e 271
 
3.7%
s 271
 
3.7%
ValueCountFrequency (%)
N 803
45.3%
o 803
45.3%
Y 55
 
3.1%
e 55
 
3.1%
s 55
 
3.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 7331
100.0%
ValueCountFrequency (%)
(unknown) 1771
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
N 3259
44.5%
o 3259
44.5%
Y 271
 
3.7%
e 271
 
3.7%
s 271
 
3.7%
ValueCountFrequency (%)
N 803
45.3%
o 803
45.3%
Y 55
 
3.1%
e 55
 
3.1%
s 55
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 7331
100.0%
ValueCountFrequency (%)
(unknown) 1771
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
N 3259
44.5%
o 3259
44.5%
Y 271
 
3.7%
e 271
 
3.7%
s 271
 
3.7%
ValueCountFrequency (%)
N 803
45.3%
o 803
45.3%
Y 55
 
3.1%
e 55
 
3.1%
s 55
 
3.1%

SexOrientation
['Text', 'Text']

              Dataset A  Dataset B
Distinct      3          3
Distinct (%)  0.1%       0.4%
Missing       1406       646
Missing (%)   32.2%      48.2%
Memory size   197.2 KiB  53.2 KiB

Length

               Dataset A    Dataset B
Max length     12           12
Median length  12           12
Mean length    11.87563366  11.85858586
Min length     8            8

Characters and Unicode

                     Dataset A  Dataset B
Total characters     35140      8218
Distinct characters  13         13
Distinct categories  1          1
Distinct scripts     1          1
Distinct blocks      1          1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

            Dataset A  Dataset B
Unique      0          0
Unique (%)  0.0%       0.0%

Sample

         Dataset A     Dataset B
1st row  Heterosexual  Heterosexual
2nd row  Heterosexual  Heterosexual
3rd row  Heterosexual  Heterosexual
4th row  Heterosexual  Heterosexual
5th row  Heterosexual  Heterosexual

Dataset A
Value         Count  Frequency (%)
heterosexual  2839   95.9%
bisexual      64     2.2%
homosexual    56     1.9%

Dataset B
Value         Count  Frequency (%)
heterosexual  665    96.0%
bisexual      21     3.0%
homosexual    7      1.0%

Most occurring characters

ValueCountFrequency (%)
e 8637
24.6%
s 2959
 
8.4%
x 2959
 
8.4%
u 2959
 
8.4%
a 2959
 
8.4%
l 2959
 
8.4%
o 2951
 
8.4%
H 2895
 
8.2%
t 2839
 
8.1%
r 2839
 
8.1%
Other values (3) 184
 
0.5%
ValueCountFrequency (%)
e 2023
24.6%
s 693
 
8.4%
x 693
 
8.4%
u 693
 
8.4%
a 693
 
8.4%
l 693
 
8.4%
o 679
 
8.3%
H 672
 
8.2%
t 665
 
8.1%
r 665
 
8.1%
Other values (3) 49
 
0.6%

Most occurring categories

ValueCountFrequency (%)
(unknown) 35140
100.0%
ValueCountFrequency (%)
(unknown) 8218
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 8637
24.6%
s 2959
 
8.4%
x 2959
 
8.4%
u 2959
 
8.4%
a 2959
 
8.4%
l 2959
 
8.4%
o 2951
 
8.4%
H 2895
 
8.2%
t 2839
 
8.1%
r 2839
 
8.1%
Other values (3) 184
 
0.5%
ValueCountFrequency (%)
e 2023
24.6%
s 693
 
8.4%
x 693
 
8.4%
u 693
 
8.4%
a 693
 
8.4%
l 693
 
8.4%
o 679
 
8.3%
H 672
 
8.2%
t 665
 
8.1%
r 665
 
8.1%
Other values (3) 49
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 35140
100.0%
ValueCountFrequency (%)
(unknown) 8218
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 8637
24.6%
s 2959
 
8.4%
x 2959
 
8.4%
u 2959
 
8.4%
a 2959
 
8.4%
l 2959
 
8.4%
o 2951
 
8.4%
H 2895
 
8.2%
t 2839
 
8.1%
r 2839
 
8.1%
Other values (3) 184
 
0.5%
ValueCountFrequency (%)
e 2023
24.6%
s 693
 
8.4%
x 693
 
8.4%
u 693
 
8.4%
a 693
 
8.4%
l 693
 
8.4%
o 679
 
8.3%
H 672
 
8.2%
t 665
 
8.1%
r 665
 
8.1%
Other values (3) 49
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 35140
100.0%
ValueCountFrequency (%)
(unknown) 8218
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 8637
24.6%
s 2959
 
8.4%
x 2959
 
8.4%
u 2959
 
8.4%
a 2959
 
8.4%
l 2959
 
8.4%
o 2951
 
8.4%
H 2895
 
8.2%
t 2839
 
8.1%
r 2839
 
8.1%
Other values (3) 184
 
0.5%
ValueCountFrequency (%)
e 2023
24.6%
s 693
 
8.4%
x 693
 
8.4%
u 693
 
8.4%
a 693
 
8.4%
l 693
 
8.4%
o 679
 
8.3%
H 672
 
8.2%
t 665
 
8.1%
r 665
 
8.1%
Other values (3) 49
 
0.6%